FPGA/CPLD design tips

This is a list of common design mistakes. These errors often make a design unreliable or slow. To improve the performance and reliability of your design, check it against every item below.

Reliability

Use a global clock buffer (BUFG) for clock signals
A clock that is not routed through a global clock buffer will have skew.

Register data with only one clock edge

Using both edges of the clock is unreliable because one or both edges can drift as the duty cycle changes. If you use only one edge, you reduce the risk posed by this drift.
This problem can be solved by letting CLKDLL automatically correct the clock to a 50% duty cycle. Otherwise, it is strongly recommended that you use only one clock edge.

Do not generate clocks internally, except with CLKDLL or DCM
This includes gated clocks and divided clocks. Instead, use a clock enable, or generate the additional clock with CLKDLL or DCM.
For a purely synchronous design, it is recommended that you use only one clock whenever possible.
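
The clock-enable alternative to a divided clock can be sketched as follows. This is a minimal illustration, not a reference implementation; the module name, port names, and the divide-by-four ratio are assumptions made for the example.

```verilog
// Hypothetical example: capture data once every four clk cycles by using
// a clock enable, instead of clocking the register from a divided clock.
module ce_divider (
    input  wire       clk,   // the single (global) clock
    input  wire [7:0] d,
    output reg  [7:0] q
);
    reg  [1:0] div = 2'd0;           // free-running divide-by-4 counter
    wire       ce  = (div == 2'd3);  // high for one clk cycle in every four

    always @(posedge clk) begin
        div <= div + 2'd1;
        if (ce)
            q <= d;                  // q updates at clk/4, but is still clocked by clk
    end
endmodule
```

Every flip-flop stays on the same clock, so the timing analyzer sees one clock domain.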

Do not generate asynchronous control signals internally
This includes internally generated asynchronous set and reset signals, which can glitch. Generate a synchronous set or reset signal instead. This signal is decoded one clock cycle before it needs to be active.
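
A synchronous clear that is decoded one cycle early might look like the sketch below. The module name, the parameter LAST, and the 4-bit width are assumptions for illustration; the register initial values rely on FPGA power-up initialization.

```verilog
// Hypothetical example: a counter that counts 0..LAST and then restarts.
// The clear is decoded at LAST-1 and registered, so the registered signal
// is active during the cycle in which count == LAST.
module sync_clear_counter #(
    parameter [3:0] LAST = 4'd9
) (
    input  wire       clk,
    output reg  [3:0] count = 4'd0
);
    reg clear = 1'b0;

    always @(posedge clk) begin
        clear <= (count == LAST - 4'd1);  // decode one cycle ahead
        if (clear)
            count <= 4'd0;                // synchronous clear, no async reset pin
        else
            count <= count + 4'd1;
    end
endmodule
```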

Do not use multiple clocks without phase relationships

You may not always be able to avoid this condition, and many designs need more than one clock. In these cases, make sure that you have used an appropriate synchronization circuit to cross clock domains, and that you have properly constrained the paths that cross clock domains.
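
One common synchronization circuit for a single-bit signal is the two-flip-flop synchronizer, sketched below. The source does not name a specific circuit, so this choice and all names here are assumptions. It is not suitable for multi-bit buses, which need a handshake or an asynchronous FIFO.

```verilog
// Hypothetical example: two-flip-flop synchronizer for one asynchronous bit.
module sync_2ff (
    input  wire clk_dst,   // clock of the destination domain
    input  wire d_async,   // single-bit signal from another clock domain
    output wire q_sync     // d_async, synchronized to clk_dst
);
    reg [1:0] sync = 2'b00;

    always @(posedge clk_dst)
        sync <= {sync[0], d_async};  // first stage may go metastable; second stage lets it settle

    assign q_sync = sync[1];
endmodule
```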

Do not use internal latches

Internal latches confuse timing analysis and often introduce additional clock signals. A latch behaves as combinatorial logic while its gate is open (transparent) and as a synchronous element while its gate is closed, which confuses timing analysis. Internal latches also often introduce a gated clock, and a gated clock can glitch and make the design unreliable.
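
Latches are often inferred by accident when a combinatorial block does not assign its output on every path. The sketch below shows one way to avoid that, using a default assignment. The module and signal names are hypothetical.

```verilog
// Hypothetical example: avoiding an unintentionally inferred latch.
module no_latch (
    input  wire       sel,
    input  wire [3:0] a,
    output reg  [3:0] y
);
    always @* begin
        y = 4'b0000;   // default assignment: y is driven on every path
        if (sel)
            y = a;     // without the default above, this "if" alone would infer a latch
    end
endmodule
```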

Performance

Logic-level delay should not exceed 50% of the timing budget

The logic-level delay of each path can be found in the logic-level timing report or the post-layout timing report. For each path it analyzes, the timing analyzer reports the delay statistics. Check whether the total logic-level delay exceeds 50% of your timing budget.

Use IOB registers

IOB registers provide the fastest clock-to-output and input-to-clock delays. There are some restrictions. For an input register, there can be no combinatorial logic between the pin and the register. For an output register, there can be no combinatorial logic between the register and the pin. For a three-state output, all registers in the IOB must use the same clock and reset signals, and the IOB tri-state register must be active low to be placed in the IOB. The tri-state buffer is active low, so no inverter is needed between the register and the tri-state buffer. You must enable the software to select IOB registers. You can set the global implementation option to use IOB registers for inputs, outputs, or both inputs and outputs. The default value is off.
You can also enable IOB registers in the synthesis tool or in the user constraint file (UCF). UCF syntax: INST <instance_name> IOB = TRUE;

Choose a fast slew rate for critical outputs

The slew rate can be selected for LVCMOS and LVTTL levels. A fast slew rate reduces the output delay but increases ground bounce, so choose it only after careful consideration.

Pipeline logic

If your design can tolerate increased latency, pipelining the combinatorial logic can improve performance. Xilinx FPGAs contain a large number of registers: there is one register for each four-input function generator. Use these registers to increase data throughput at the expense of latency.
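
As a rough sketch of pipelining, the hypothetical module below adds four inputs in two registered stages instead of one long combinatorial path. The names and bit widths are assumptions for illustration.

```verilog
// Hypothetical example: sum of four 8-bit inputs, pipelined in two stages.
// Latency is two clock cycles; a new result is produced every cycle.
module pipelined_sum4 (
    input  wire       clk,
    input  wire [7:0] a, b, c, d,
    output reg  [9:0] sum
);
    reg [8:0] ab, cd;          // stage 1 partial sums (9 bits keep the carry)

    always @(posedge clk) begin
        ab  <= a + b;          // stage 1
        cd  <= c + d;
        sum <= ab + cd;        // stage 2: uses the partial sums from the previous cycle
    end
endmodule
```

Each register-to-register path now contains one adder instead of two, at the cost of one extra cycle of latency.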

Write code for a four-input lookup table architecture

Remember that each lookup table can implement any combinatorial logic function of up to four inputs. If a function needs more inputs, keep in mind the number of lookup tables needed to implement it.

Use a case statement instead of an if-then-else statement

Complex if-then-else statements usually generate priority-decoding logic, which increases the combinatorial delay on these paths. Case statements used to describe complex logic typically generate parallel logic with less delay. Verilog users can use the compiler directive "synopsys parallel_case".
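
A small sketch of the case style follows. The multiplexer and its names are hypothetical; the branches are mutually exclusive and a default is given, so no priority chain is described.

```verilog
// Hypothetical example: 4-to-1 multiplexer written with a case statement.
// An equivalent if / else if / else chain describes priority between the tests.
module mux4_case (
    input  wire [1:0] sel,
    input  wire [7:0] a, b, c, d,
    output reg  [7:0] y
);
    always @* begin
        case (sel)
            2'b00:   y = a;
            2'b01:   y = b;
            2'b10:   y = c;
            default: y = d;   // covers 2'b11 and keeps y assigned on every path
        endcase
    end
endmodule
```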

Use one or more Core Generator blocks

Core Generator blocks are optimized for the Xilinx architecture. Many blocks are user-configurable, including size and pipeline delay. Look at the critical paths in your design: could a core from the Core Generator improve critical-path performance?

Keep the finite state machine (FSM) in its own level of hierarchy

To allow the synthesis tool to fully optimize your FSM, the FSM must be in its own block. If it is not, the synthesis tool will optimize the FSM logic along with the logic around it.
The FSM must not include any arithmetic logic, datapath logic, or other combinatorial logic that is not related to the state machine.

Write the finite state machine using two processes or always blocks

The next-state decode logic and the output decode logic must be placed in separate processes or always blocks. This prevents the synthesis tool from sharing resources between the output and next-state decode logic.

Use one-hot encoding for the finite state machine (FSM)

One-hot encoding usually gives the highest-performance state machine in a register-rich FPGA.
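
The FSM guidelines above can be sketched together in one small module: one-hot state encoding (one flip-flop per state), with the next-state decode and the output decode in separate always blocks. The states, ports, and module name are hypothetical, and a synthesis tool may still re-encode the states depending on its settings.

```verilog
// Hypothetical example: three-state FSM, one-hot encoded, two always blocks.
module fsm_onehot (
    input  wire clk,
    input  wire rst,     // synchronous reset
    input  wire start,
    input  wire done,
    output reg  busy
);
    localparam [2:0] IDLE = 3'b001,
                     RUN  = 3'b010,
                     FIN  = 3'b100;

    reg [2:0] state = IDLE;

    // Block 1: state register and next-state decode
    always @(posedge clk) begin
        if (rst)
            state <= IDLE;
        else
            case (state)
                IDLE:    if (start) state <= RUN;
                RUN:     if (done)  state <= FIN;
                FIN:     state <= IDLE;
                default: state <= IDLE;   // recover from an illegal state
            endcase
    end

    // Block 2: output decode only
    always @* begin
        busy = (state == RUN);
    end
endmodule
```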

Provide registered outputs for each leaf-level block

A leaf-level block is a block that infers logic, whereas a structural-level block only instantiates lower-level blocks. This establishes a hierarchy. If the outputs of the leaf-level blocks are registered, the synthesis tool can preserve the hierarchy. This makes static timing analysis of the code easier, and registering the boundaries gives a well-defined timing relationship between the blocks.

Follow the data flow with appropriate pin location constraints

Data flow in Xilinx devices runs horizontally. One reason is that the carry chains run vertically. Another is that the tri-state buffer lines and the direct connections between blocks run horizontally. To take advantage of this data flow, address and data pins must be placed on the left or right side of the chip. Because the carry chains run from bottom to top, place the least significant bit at the bottom. Place control signals on the top and bottom of the chip.

Use different counter styles

Binary counters are relatively slow. If a binary counter is in your critical path, consider using a different style of counter: LFSR, pre-scaled, or Johnson.
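
As a sketch of the LFSR style, the hypothetical module below steps through 15 states using a single XOR as its next-state logic. The 4-bit width, the tap positions (polynomial x^4 + x^3 + 1), the seed, and the terminal-count state are choices made for this example. The count sequence is not binary, so it suits timers and dividers rather than arithmetic.

```verilog
// Hypothetical example: 4-bit maximal-length LFSR counter (15 states).
// The all-zero state is a lock-up state, so the register starts non-zero.
module lfsr4 (
    input  wire       clk,
    input  wire       ce,            // clock enable
    output reg  [3:0] q = 4'b0001,
    output wire       tc             // high in the last state before the sequence repeats
);
    always @(posedge clk)
        if (ce)
            q <= {q[2:0], q[3] ^ q[2]};   // shift left, feed back the XOR of the top two bits

    assign tc = (q == 4'b1000);           // 4'b1000 is followed by the seed 4'b0001
endmodule
```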

Design hierarchically, with separate functional blocks and technology-specific blocks

The design must be divided into functional blocks: first the top-level functional blocks, then the lower-level blocks. You should also include technology-specific blocks. Design hierarchy makes the design more readable, easier to debug, and easier to reuse.

Duplicate high-fanout nets

This can be controlled by your synthesis tool. However, for tighter control over the duplication, you can choose to duplicate the registers yourself.

Use the four global constraints

Globally constrain the design with the four global constraints: period (for each clock), offset in, offset out, and pin-to-pin. You may also have other constraints for multi-cycle paths, false paths, and critical paths, but you should always start by specifying the four global constraints.
