FMUSER Wirless Transmit Video And Audio More Easier !

[email protected] WhatsApp +8618078869184

    On TMS320C6748 Minimum System Design and Cache Configuration

This article introduces the TMS320C6748 and describes its structure in detail. The TMS320C6748 is a high-performance floating-point processor from Texas Instruments (TI). It is described as the lowest-power floating-point digital signal processor (DSP) in the industry, aimed at energy-efficient, highly connected designs that need highly integrated peripherals, lower heat dissipation, and longer battery life. It provides a universal parallel port (uPP) and is also the first TI device to integrate a Serial Advanced Technology Attachment (SATA) interface.

The TL6748-EVM evaluation kit from Guangzhou Chuanglong provides a complete software development environment for the TMS320C6748. It supports bare-metal, SYS/BIOS, and DSP/BIOS development and ships with reference baseboard schematics, an introductory C6748 DSP tutorial, demo programs, complete software development packages, and detailed C6748 system development documentation. These help users evaluate the processor quickly, design system drivers and custom application software, shorten the development cycle, and bring products to market sooner. The kit mainly targets data acquisition and processing in industries such as power, communications, industrial control, and audio and video processing. The TL6748-EVM is a feature-rich development board and a complete experimental evaluation platform that gives embedded designers a fast, simple, and practical way to evaluate TMS320C674x-series processors.
Specifications

Processor
• CPU: TI TMS320C6748, C6000-series floating-point DSP (pin-to-pin compatible with the OMAP-L138 and AM1808 processors)
• Main frequency: 456 MHz

Storage
• 128/256 Mbyte industrial-grade DDR2
• 128/256/512 Mbyte industrial-grade NAND flash

Audio/video interfaces
• 1 x 3.5 mm LINE IN audio input
• 1 x 3.5 mm MIC IN audio input
• 1 x 3.5 mm LINE OUT audio output

Peripheral interfaces
Serial ports and USB:
• UART1: 3-wire serial port, RS232 level
• UART2: 3-wire serial port, RS232 level
• 1 x RS485 interface (multiplexed with UART1)
• 1 x USB 2.0 OTG (Mini USB-B female connector)
• 4 x USB 1.1 host (2 x double-stacked USB Type-A sockets)
Other peripheral interfaces:
• 1 x standard SD card slot (SD/MMC signals, supports 1.8 V and 3.3 V logic levels)
• 1 x 10/100 Mbps Ethernet interface (RJ45 connector)
• 1 x SATA interface (7-pin SATA hard disk interface)
• 1 x RTC holder (RC1220)
• 1 x JTAG interface (14-pin TI Rev B JTAG)
Expansion interface (2 x IDC3 box headers, 2 x 25 pins):
• 1 x SPI bus
• 1 x IIC (I2C) bus
• 1 x UART
• 1 x VPIF bus
• 1 x uPP bus
• 1 x EMIF bus
• Multiple configurable GPIOs

Input
• 1 x non-maskable interrupt key
• 1 x programmable key
• 1 x reset button

LED indicators
• 2 x power indicators
• 1 x system indicator
• 4 x programmable indicators

Display
• 1 x VGA interface
• 1 x 24-bit true-color LCD interface (50-pin FPC connector), including a 4-wire resistive touch screen interface; supports resolutions up to 1366x768

Electrical parameters
• Working voltage: 3.8 V to 5.5 V (typical: 5 V)
• Operating temperature: -45 °C to 85 °C
• Ambient humidity: 20% to 90%, non-condensing
• Development board size: 180 mm x 130 mm
• Core board size: 55 mm x 33 mm
• Electrical rating: +5 V @ 150 mA
• Core board power consumption: 5 V, 98 mA, 0.49 W

On the minimum system design and cache configuration of the TMS320C6748

The processor cache is a high-speed storage area that stores data close to the processor.
This gives fast access to frequently used instructions and data, which improves computing performance. From the program's point of view, a cache can be regarded as ordinary flat memory: it is simply memory close to the CPU that can be accessed quickly.

Memory organization

The model on the left of Figure 1 is a flat memory architecture. Assume that both the CPU and the on-chip memory run at 300 MHz. Memory access delays then occur only when the CPU accesses external memory; accesses to on-chip memory cause no stalls. If the CPU frequency is raised to 600 MHz, however, accesses to on-chip memory also incur wait states. Building an on-chip memory that is both large enough and able to run at 600 MHz would be very expensive. If the on-chip memory stays at 300 MHz, each access to it takes two CPU cycles, that is, one stall cycle.

One solution is a hierarchical memory architecture. A fast memory sits close to the CPU; it causes no stalls, but it is very small. The external memory is large, but it is far from the CPU and each access causes long stalls. The memory close to the CPU can be regarded as a cache.

Principle of locality

This solution is only effective when most CPU accesses go to the memory closest to it. Fortunately, the principle of locality of reference makes this likely: within a relatively short time window, a program needs only a relatively small amount of data and code. There are two kinds of locality:
• spatial correlation (spatial locality): when a data item is accessed, its neighbors are likely to be accessed soon afterwards;
• temporal correlation (temporal locality): when a memory location is accessed, it is likely to be accessed again in the near future.

Spatial correlation reflects how programs are built: related data is usually compiled and linked into adjacent, contiguous memory.
For example, processing the first element of an array and then the second is an instance of spatial correlation. Temporal correlation, similarly, is mainly due to the loops that account for much of a program's run time. The code of a loop is usually executed many times in succession, and the loop often accesses the same data repeatedly.

Figure 2 illustrates spatial correlation with the data access pattern of a 6-tap FIR filter. To compute the output y[0], six samples are read from the input buffer x[]. On the first access, the cache controller fetches x[0] together with some of the data at the following addresses (how much depends on the cache line length). Loading a cache line from slow memory costs a number of CPU stall cycles. The motivation for fetching the whole line is that the data following x[0] is likely to be accessed soon. This is clearly true for the FIR filter, because the next five samples (x[1] to x[5]) are needed as well. Those five accesses are then served from the fast cache.

When the next output y[1] is calculated, five samples (x[1] to x[5]) are reused and only one new sample (x[6]) has to be loaded. The reused samples are already in the cache, so accessing them causes no CPU stall. This is the temporal correlation mentioned above: data used in one step is likely to be used again in the next.

A cache exploits the temporal and spatial correlation of data accesses. It reduces the number of accesses to slow external memory as far as possible, so that most data accesses are served by the faster cache levels.
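The FIR access pattern described above can be sketched in a few lines of Python. This is only an illustrative model, not a description of the C6748 cache controller: the function name `fir_line_fetches`, the line length of 8 samples, and the assumption that fetched lines are never evicted are all hypothetical choices made for the example.

```python
def fir_line_fetches(num_outputs, taps=6, samples_per_line=8):
    """Count cache hits and misses while a FIR filter reads its input.

    Output y[n] reads samples x[n] .. x[n + taps - 1]. A miss fetches the
    whole cache line containing the sample. Assumption: fetched lines are
    never evicted (the buffer is much smaller than the cache).
    """
    cached_lines = set()
    hits = misses = 0
    for n in range(num_outputs):
        for k in range(taps):
            line = (n + k) // samples_per_line  # line that holds x[n + k]
            if line in cached_lines:
                hits += 1
            else:
                cached_lines.add(line)  # line fill from slow memory
                misses += 1
    return hits, misses


# y[0]: the access to x[0] misses and pulls in the line; x[1]..x[5] hit.
print(fir_line_fetches(1))                      # (5, 1)
# y[0] and y[1] with a line that holds only six samples: x[6] misses too.
print(fir_line_fetches(2, samples_per_line=6))  # (10, 2)
```

With a longer line, x[6] is already present when y[1] is computed, which is the case where the second output causes no stall at all.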
Speed of the memory levels

A cache system usually includes the following three levels:
• the first level (L1) is on the CPU chip and runs at the CPU clock frequency;
• the second level (L2) is also on the chip, but it is slightly slower than L1 and has a larger capacity;
• the third level (L3) is external memory, which is the slowest and the largest.

Each level has different access performance; Figure 3 compares them. When the processor requests data, it first looks in the highest-level cache (L1) and then in the next level down. If the requested data is found in a cache, the access is a cache hit; otherwise it is a cache miss. The performance of the cache system therefore depends on the hit rate: for any cache level, the higher the hit rate, the better the performance.

For example, suppose 70% of memory accesses hit in L1, 20% hit in L2, and the remaining 10% are split evenly between two slower levels with access times of 30 ns and 220 ns. With the access times shown in Figure 3, the average memory access time is (0.7 * 4) + (0.2 * 5) + (0.05 * 30) + (0.05 * 220) = 16.30 ns.

Consider the memory architecture of the TI TMS320C64x DSP shown in Figure 4: a two-level on-chip cache plus off-chip external memory. The first-level cache is divided into a program cache (L1P) and a data cache (L1D), each with a capacity of 16 Kbytes. Accesses that hit in L1 cause no memory stalls. The L2 memory is divided into L2 SRAM and L2 cache. Regardless of the configuration, an access to L2 takes two CPU cycles. The L2 capacity differs between DSPs; the TMS320C6454, for example, has 1 Mbyte. Finally, the C64x DSP supports up to 2 Gbytes of external memory. The access speed of external memory depends on the memory type, but its frequency is generally about 100 MHz. All caches (shown in red) and data paths in Figure 4 are maintained automatically by the cache controller.
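The average access time in the example above is a weighted sum of per-level access times. A minimal Python sketch of that calculation follows; the helper name `average_access_time_ns` is hypothetical, and the fractions and times are simply the four terms that appear in the formula.

```python
def average_access_time_ns(levels):
    """Weighted average access time.

    levels: list of (fraction_of_accesses, access_time_ns) pairs, one per
    memory level. The fractions must add up to 1.
    """
    total_fraction = sum(fraction for fraction, _ in levels)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return sum(fraction * time_ns for fraction, time_ns in levels)


# The four terms of the worked example: (fraction, access time in ns).
example_levels = [(0.70, 4), (0.20, 5), (0.05, 30), (0.05, 220)]
print(f"{average_access_time_ns(example_levels):.2f} ns")  # 16.30 ns
```

Note how the slowest level dominates: 5% of the accesses contribute 11 ns of the 16.30 ns average, which is why a small improvement in hit rate has a large effect.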
Cache update

A cache always holds a copy of main memory, so it must reflect the contents of main memory at all times. If data has been updated in the cache but not in main memory, the data in the cache is called dirty. If data has been updated in main memory but not in the cache, the data in the cache is called stale.

The cache controller uses a number of techniques to maintain cache coherence. Snooping ("snoop") and forced updating ("snarf") are two common ones. Snooping lets the cache determine whether an operation on main memory affects an address that it currently holds. Snarfing copies the data from main memory into the cache.

The cache is usually much smaller than main memory, so it eventually fills up. From then on, newly loaded data must replace data that is already in the cache. There are several replacement strategies, such as random replacement, first in first out (FIFO), and least recently used (LRU). Most processors use LRU, which replaces the least recently used data with the new data. This strategy is very effective because it takes the temporal correlation of data accesses into account.

Direct-mapped cache

Caches are either direct-mapped or set-associative. Figure 5 shows the L1P cache of the C64x, which contains 512 cache lines of 32 bytes each.
Each external memory address is always mapped to the same cache line, for example:
• addresses 0000h to 001Fh are always mapped to cache line 0;
• addresses 0020h to 003Fh are always mapped to cache line 1;
• addresses 3FE0h to 3FFFh are always mapped to cache line 511.

The cache capacity (512 x 32 bytes = 4000h bytes) is exhausted at address 4000h, so addresses 4000h to 401Fh wrap around and are mapped to cache line 0 again.

To keep track of the data copied from external memory, each L1P cache line carries the following information:
• valid bit: indicates whether the cache line currently contains valid data;
• tag: corresponds to the upper 18 bits of the external memory address. The tag is needed because each cache line can hold data from several external memory addresses; line 0, for example, can hold data from addresses 0000h to 001Fh or from addresses 4000h to 401Fh;
• set number: corresponds to bits 5 to 13 of the address.
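The mapping described above can be checked with a short Python sketch. It assumes 32-bit addresses (which is what makes the tag 18 bits wide) and uses hypothetical names, `split_address` and `DirectMappedCache`; it models only the valid bit and tag bookkeeping, not the real L1P controller.

```python
LINE_SIZE = 32    # bytes per cache line
NUM_LINES = 512   # lines in the 16 Kbyte L1P cache

OFFSET_BITS = LINE_SIZE.bit_length() - 1  # 5 bits: address bits 0..4
INDEX_BITS = NUM_LINES.bit_length() - 1   # 9 bits: address bits 5..13


def split_address(addr):
    """Split an address into (tag, line number, byte offset within the line)."""
    offset = addr & (LINE_SIZE - 1)
    line = (addr >> OFFSET_BITS) & (NUM_LINES - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)  # remaining upper bits
    return tag, line, offset


class DirectMappedCache:
    """Tracks only the valid bit and tag of each line (no data storage)."""

    def __init__(self):
        self.valid = [False] * NUM_LINES
        self.tags = [0] * NUM_LINES

    def access(self, addr):
        """Return True on a hit; on a miss, fill the line and return False."""
        tag, line, _ = split_address(addr)
        hit = self.valid[line] and self.tags[line] == tag
        if not hit:
            self.valid[line] = True
            self.tags[line] = tag
        return hit


for addr in (0x0000, 0x001F, 0x0020, 0x3FE0, 0x3FFF, 0x4000, 0x401F):
    tag, line, offset = split_address(addr)
    print(f"{addr:04X}h -> tag {tag}, line {line}, offset {offset}")
```

Addresses 0000h and 4000h both select line 0 but carry different tags (0 and 1), so accessing them alternately in this model evicts the line every time. That conflict behavior is the main cost of a direct-mapped design.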

     

     

     

     


  • Contact

    Address:
    No.305 Room HuiLan Building No.273 Huanpu Road Guangzhou China 510620

    Copyright 2006-2020 Powered By www.fmuser.org
