Thilina's Blog

I might be wrong, but…

Using Xilinx Core Generator – Division in FPGA

Xilinx ISE comes with a number of cores which can be used with Xilinx products. While we were working on our Final Year Project at the university, we used a number of these cores to make our work easier. In this article I will share my experience with the Xilinx CORE Generator by implementing a division core as an example.

For synthesis in Xilinx ISE, division written with the Verilog / operator is supported only in limited cases, such as

· (constant)/(constant) or

· (variable)/(constant that is a power of two)

Division of a variable by a variable, or of a constant by a variable, cannot be done with a simple / operator. It needs a separate core (generated with the Xilinx CORE Generator, or an advanced module designed by someone else). Look at the following example.

module simpleDivision(
  input clk,
  input [7:0] div,
  output reg [7:0] quo = 0
);
  // Division by a constant power of two, registered on the rising clock edge
  always @ (posedge clk)
    quo <= div/4;
endmodule

I used the following test bench to simulate the above code and observed that the result is available within a single clock cycle, as below.

module test_simpleDiv;
  reg [7:0] div;
  reg clk;
  wire [7:0] quo;
  // Unit under test, ports connected by name
  simpleDivision uut (
    .clk(clk),
    .div(div),
    .quo(quo)
  );
  initial begin
    div = 0;
    clk = 0;
    #10 div = 20;
    #10 div = 5;
    #10 div = 121;
    #10 div = 13;
    #30 $finish;
  end
  always #5 clk = ~clk;
endmodule

Note that the line

always #5 clk = ~clk;

generates a clock signal with a period of 10 ns (assuming a time unit of 1 ns). The result was as below.
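The same check can also be done in text form. The following is a minimal self-checking sketch, not the test bench used for this post; the module name test_simpleDiv_check and the task name check are made up for illustration. It relies on the fact that dividing an unsigned value by 4 gives the same result as shifting it right by two bits, which is why this kind of division needs no divider core.

```verilog
`timescale 1ns / 1ps

// Copy of the simpleDivision module from this post, so the sketch is self-contained.
module simpleDivision(
  input clk,
  input [7:0] div,
  output reg [7:0] quo = 0
);
  always @(posedge clk)
    quo <= div / 4;
endmodule

// Hypothetical self-checking test bench (name assumed for illustration).
module test_simpleDiv_check;
  reg [7:0] div = 0;
  reg clk = 0;
  wire [7:0] quo;

  simpleDivision uut (.clk(clk), .div(div), .quo(quo));

  always #5 clk = ~clk;   // 10 ns clock period

  // Apply one operand, wait for the next rising edge, then compare
  // the registered quotient with the operand shifted right by two bits.
  task check(input [7:0] value);
    begin
      @(negedge clk) div = value;   // change the input away from the sampling edge
      @(posedge clk) #1;            // let the registered output update
      if (quo !== (value >> 2))
        $display("MISMATCH: %0d/4 gave %0d", value, quo);
      else
        $display("%0d/4 = %0d", value, quo);
    end
  endtask

  initial begin
    check(20);
    check(5);
    check(121);
    check(13);
    $finish;
  end
endmodule
```

Note that integer division truncates, so 121/4 is reported as 30 and 13/4 as 3.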


For (variable)/(variable) division, the Xilinx CORE Generator provides a divider core with two algorithm modes:

· Radix2 mode

· High Radix mode

I’ll discuss how to generate the Radix2 division core using the Xilinx CORE Generator, how to add a simple wrapper around the core, and the simulation results of the division.

Division Core with Radix2 algorithm
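As background, radix-2 division produces one quotient bit per step, much like long division done in binary. The following behavioral sketch uses the widths chosen in this post (10-bit dividend, 8-bit divisor). The module names radix2DivSketch and radix2DivSketch_tb are hypothetical, and this is only an illustration of the idea, not the implementation inside the Xilinx core, which is described in its datasheet.

```verilog
`timescale 1ns / 1ps

// Illustrative restoring division, one quotient bit per loop iteration.
// Not the Xilinx core: the core is pipelined, this sketch is purely combinational.
module radix2DivSketch(
  input  [9:0] dividend,
  input  [7:0] divisor,
  output reg [9:0] quotient,
  output reg [7:0] remainder
);
  reg [8:0] partial;   // partial remainder, one bit wider than the divisor
  integer i;

  always @* begin
    partial  = 0;
    quotient = 0;
    for (i = 9; i >= 0; i = i - 1) begin
      // Bring down the next dividend bit, most significant bit first
      partial = {partial[7:0], dividend[i]};
      // If the divisor fits, subtract it and set this quotient bit
      if (partial >= {1'b0, divisor}) begin
        partial     = partial - {1'b0, divisor};
        quotient[i] = 1'b1;
      end
    end
    remainder = partial[7:0];   // result is undefined for divisor == 0
  end
endmodule

// Small test bench printing the two operand pairs used later in this post.
module radix2DivSketch_tb;
  reg  [9:0] dividend;
  reg  [7:0] divisor;
  wire [9:0] quotient;
  wire [7:0] remainder;

  radix2DivSketch dut (
    .dividend(dividend),
    .divisor(divisor),
    .quotient(quotient),
    .remainder(remainder)
  );

  initial begin
    dividend = 105; divisor = 100;
    #1 $display("%0d / %0d = %0d rem %0d", dividend, divisor, quotient, remainder);
    dividend = 529; divisor = 10;
    #1 $display("%0d / %0d = %0d rem %0d", dividend, divisor, quotient, remainder);
    $finish;
  end
endmodule
```

For these operands the sketch should print 1 remainder 5 and 52 remainder 9. Because one quotient bit is produced per step, the number of steps grows with the dividend width.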

1. Right-click on your project hierarchy and select “New Source”.


2. In the next window select “IP (CORE Generator & Architecture Wizard)”, enter a file name and click “Next”.


3. Then expand “Math Functions” and “Dividers”, select “Divider Generator” and click “Next”.


4. After some time (depending on the performance of your PC) the core generator will start. I used the following configuration for the core in this post.

· Algorithm type: radix2

· Remainder type: remainder

· Operand sign : unsigned

· Dividend, Quotient: 10 bits

· Divisor width: 8 bits

· Clocks per Division: 1

You can view the datasheet of the core by clicking the “Datasheet” button, and generate the core by clicking the “Generate” button.


Again, after some time, your core will be generated. The next step is to create a wrapper module to hide the complexity of the core. In this case, since the core is not very complex (the high radix core is a bit more complex, in my opinion), the wrapper will be simple. You can view the HDL functional model and the HDL instantiation template by clicking the options in the Design tab.


You can use the timing diagram from the datasheet to design your wrapper. The timing diagram for the above configuration is as below.


Therefore the wrapper and the test bench code will be as below. According to the timing diagram, the input data are sampled at the positive edge of the clock signal while the rfd (ready for data) flag is high. We could therefore add an always block on the clk signal, conditioned on rfd, to load the dividend and the divisor (this is not coded here). In the same way, we can use the positive or the negative edge of the clock to load the outputs into registers. The CE signal is not used here.

module divCoreWrap(
  input clk,
  input [7:0] dataInDvs,        // divisor
  input [9:0] dataInDvd,        // dividend
  output reg [9:0] dataOutquo,  // registered quotient
  output reg [7:0] dataOutrem,  // registered remainder
  output ready                  // rfd flag passed through from the core
);
  wire [9:0] quotient;
  wire [7:0] fractional;
  // Register the core outputs on every rising clock edge
  always@(posedge clk) begin
    dataOutquo <= quotient;
    dataOutrem <= fractional;
  end
  // Generated divider core, taken from the HDL instantiation template
  dividerModule YourInstanceName (
    .clk(clk), // input clk
    .rfd(ready), // output rfd
    .dividend(dataInDvd), // input [9 : 0] dividend
    .divisor(dataInDvs), // input [7 : 0] divisor
    .quotient(quotient), // output [9 : 0] quotient
    .fractional(fractional)); // output [7 : 0] fractional
endmodule
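The rfd-gated loading mentioned before the wrapper could look roughly like this. It is a sketch under assumptions: divInputStage is a hypothetical helper module, its dividend and divisor outputs would be wired to the core's dividend and divisor ports in place of dataInDvd and dataInDvs, and it adds one clock cycle of latency. Check the datasheet for how rfd behaves in your configuration.

```verilog
// Hypothetical input stage: operands are loaded only while the core
// signals that it is ready for data.
module divInputStage(
  input clk,
  input rfd,                      // ready-for-data flag from the core
  input [9:0] dataInDvd,          // dividend supplied by the user logic
  input [7:0] dataInDvs,          // divisor supplied by the user logic
  output reg [9:0] dividend = 0,  // to the core's dividend port
  output reg [7:0] divisor  = 1   // to the core's divisor port (starts non-zero)
);
  always @(posedge clk) begin
    if (rfd) begin
      dividend <= dataInDvd;
      divisor  <= dataInDvs;
    end
  end
endmodule
```

The divisor register starts at 1 only so that the core is not fed a division by zero before the first operands arrive; that initial value is a design choice of this sketch.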


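module test_001;
  reg clk;
  reg [9:0] dataInDvd;
  reg [7:0] dataInDvs;
  wire [9:0] dataOutquo;
  wire [7:0] dataOutrem;
  wire ready;
  // Wrapper under test, ports connected by name
  divCoreWrap uut (
    .clk(clk),
    .dataInDvs(dataInDvs),
    .dataInDvd(dataInDvd),
    .dataOutquo(dataOutquo),
    .dataOutrem(dataOutrem),
    .ready(ready)
  );
  initial begin
    clk = 0;
    dataInDvd = 0;
    dataInDvs = 0;
    // The #10 delays are an assumption: they space the operand pairs
    // one clock period apart so that each pair reaches the core.
    #10;
    dataInDvd = 105;
    dataInDvs = 100;
    #10;
    dataInDvd = 529;
    dataInDvs = 10;
    #200 $finish;
  end
  always #5 clk = ~clk;
endmodule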


As you can see, the result is available at the output 12 clock cycles after the data is placed on the input buses.
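Because the result appears a fixed number of cycles after the input, a small delay line can mark which output belongs to which input. In this sketch, divValidTracker, inValid, outValid and the LATENCY parameter are all hypothetical names. LATENCY defaults to the 12 cycles observed in this simulation and may need adjusting by one cycle, depending on where you sample the output.

```verilog
// Hypothetical valid tracker: delays a "data applied" pulse by LATENCY clock
// edges so that it lines up with the corresponding divider output.
module divValidTracker #(
  parameter LATENCY = 12          // assumed from the simulation above, must be >= 2
)(
  input  clk,
  input  inValid,                 // high in the cycle new operands are applied
  output outValid                 // high when the matching result is expected
);
  reg [LATENCY-1:0] pipe = 0;

  // Shift the valid flag along by one stage on every rising clock edge
  always @(posedge clk)
    pipe <= {pipe[LATENCY-2:0], inValid};

  assign outValid = pipe[LATENCY-1];
endmodule
```

Comparing outValid against the waveform of dataOutquo in simulation is the easiest way to confirm that the chosen LATENCY matches your core configuration.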

I hope you now have a basic idea of using the Xilinx CORE Generator, writing a wrapper, and running simulations. Thank you very much for reading.

2011 June 6 - Posted by | Electronics, FPGA, Technology


  1. I, for one, understood it perfectly 😉

    Comment by deeps | 2011 June 6 | Reply

    • Hee hee… 😀 😀

      Comment by Thilina S. | 2011 June 6 | Reply

    • Well, these things aren't as easy as eating at the canteen. 😛

      Comment by Akila | 2011 June 7 | Reply

  2. Hello

    I’m trying to implement a divisor core, I tried to following steps which are shown above, but I failed. Can you add a more detailed tutorial? I couldn’t understand where you are calling modules ( I couldn’t found where you called required modules ), what is wrap etc. I’m using Digilent Basys 2 development board, and I dont simulate my code, I will synthsise it and run on BASYS2, this simulation modules also confused me. If you can add a tutorial step by step ( adding new module to empty project, adding IP core, calling IP core etc ), I will be happy. Thanks.

    Mert Solkıran

    Comment by Mert Solkıran | 2011 July 31 | Reply

  3. goood one dear

    Comment by tammna | 2012 October 25 | Reply

  4. warning msg is given as….The chosen IP does not support a Verilog behavioral model,
    generating a Verilog structural model instead.
    can u suggest me….

    Comment by ms kul | 2012 December 7 | Reply

  5. Can you explain why the latency shown in divider generator wizard is 10 but output is available 12 cycles after placing input?

    Comment by Anh Chu | 2013 October 29 | Reply

    • Ok, I know now, the dividend width is 10 (which is not shown in the wizard), so the latency is 10 + 2 = 12.
      Thanks for your nice tutorial.

      Comment by Anh Chu | 2013 October 29 | Reply
