VPR-7.0
path_delay.c
1 #include <stdio.h>
2 #include <string.h>
3 #include <limits.h>
4 #include <math.h>
5 #include "util.h"
6 #include "vpr_types.h"
7 #include "globals.h"
8 #include "path_delay.h"
9 #include "path_delay2.h"
10 #include "net_delay.h"
11 #include "vpr_utils.h"
12 #include <assert.h>
13 #include "read_xml_arch_file.h"
14 #include "ReadOptions.h"
15 #include "read_sdc.h"
16 #include "stats.h"
17 
18 /**************************** Top-level summary ******************************
19 
20 Timing analysis by Vaughn Betz, Jason Luu, and Michael Wainberg.
21 
22 Timing analysis is a three-step process:
23 1. Interpret the constraints specified by the user in an SDC (Synopsys Design
24 Constraints) file or, if none is specified, use default constraints.
25 2. Convert the pre- or post-packed netlist (depending on the stage of the
26 flow) into a timing graph, where nodes represent pins and edges represent
27 dependencies and delays between pins (see "Timing graph structure", below).
28 3. Traverse the timing graph to obtain information about which connections
29 to optimize, as well as statistics for the user.
30 
31 Steps 1 and 2 are performed through one of two helper functions:
32 alloc_and_load_timing_graph and alloc_and_load_pre_packing_timing_graph.
33 Within these functions, the timing graph is first created and stored in the array
34 tnode ("timing node"). This is done through alloc_and_load_tnodes (post-
35 packed) or alloc_and_load_tnodes_from_prepacked_netlist. Then, the timing
36 graph is topologically sorted ("levelized") in
37 alloc_and_load_timing_graph_levels, to allow for faster traversals later.
38 
39 read_sdc reads the SDC file, interprets its contents and stores them in
40 the data structure g_sdc. (This data structure does not need to remain
41 global but it is probably easier, since its information is used in both
42 netlists and only needs to be read in once.)
43 
44 load_clock_domain_and_clock_and_io_delay then gives each flip-flop and I/O
45 the index of a constrained clock from the SDC file in g_sdc->constrained_
46 clocks, or -1 if an I/O is unconstrained.
47 
48 process_constraints does a pre-traversal through the timing graph and prunes
49 all constraints between domains that never intersect so they are not analysed.
50 
51 Step 3 is performed through do_timing_analysis. For each constraint between
52 a pair of clock domains we do a forward traversal from the "source" domain
53 to the "sink" domain to compute each tnode's arrival time, the time when the
54 latest signal would arrive at the node. We also do a backward traversal to
55 compute required time, the latest time the signal can leave the
56 node and still meet the constraint. The "slack" of each sink pin on each net is
57 the difference between its required and arrival times.
58 
59 If path counting is on, we calculate a forward and backward path weight in
60 do_path_counting. These represent the importance of paths fanning,
61 respectively, into and out of this pin, in such a way that paths with a
62 higher slack are discounted exponentially in importance.
63 
64 If path counting is off and we are using the pre-packed netlist, we also
65 calculate normalized costs for the clusterer (normalized arrival time, slack
66 and number of critical paths) in update_normalized_costs. The clusterer uses these
67 to calculate a criticality for each block.
68 
69 Finally, in update_slacks, we calculate the slack for each sink pin on each net
70 for printing, as well as a derived metric, timing criticality, which the
71 optimizers actually use. If path counting is on, we calculate a path criticality
72 from the forward and backward weights on each tnode. */
73 
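/* Rough sketch of how the step-3 quantities relate for a single connection
 * across a tedge from a driver tnode "from" to a sink tnode "to" (see
 * update_slacks for the exact computation):
 *
 *     slack(edge)              ~  T_req(to) - T_arr(from) - Tdel(edge)
 *     timing_criticality(edge) ~  1 - slack(edge) / criticality_denom
 *
 * where criticality_denom is the normalizing denominator passed to
 * update_slacks. A connection on the most critical path has slack near 0 and
 * criticality near 1; a connection with large positive slack has criticality
 * near 0. */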
74 /************************* Timing graph structure ****************************
75 
76 Author: V. Betz
77 
78 We can build timing graphs that match either the primitive (logical_block)
79 netlist (of basic elements before clustering, like FFs and LUTs) or that
80 match the clustered netlist (block). You pass in the is_pre_packed flag to
81 say which kind of netlist and timing graph you are working with.
82 
83 Every used (not OPEN) block pin becomes a timing node, both on primitive
84 blocks and (if you’re building the timing graph that matches a clustered
85 netlist) on clustered blocks. For the clustered (not pre_packed) timing
86 graph, every used pin within a clustered (pb_type) block also becomes a timing
87 node. So a CLB that contains ALMs that contain LUTs will have nodes created
88 for the CLB pins, ALM pins and LUT pins, with edges that connect them as
89 specified in the clustered netlist. Unused (OPEN) pins don't create any
90 timing nodes.
91 
92 The primitive blocks have edges from the tnodes that represent their input pins
93 to the tnodes that represent their output pins; these edges capture the timing
94 dependencies within these lowest-level blocks (e.g. LUTs, FFs and IOs). The exact edges created come from reading
95 the architecture file. A LUT for example will have delays specified from all
96 its inputs to its output, and hence we will create a tedge from each tnode
97 corresponding to an input pin of the LUT to the tnode that represents the LUT
98 output pin. The delay marked on each edge is read from the architecture file.
99 
100 An FF has nodes representing its pins, but also two extra nodes, representing
101 the TN_FF_SINK and TN_FF_SOURCE. Those two nodes represent the internal storage
102 node of the FF. The TN_FF_SINK has edges going to it from the regular FF inputs
103 (but not the clock input), with appropriate delays. It has no outgoing edges,
104 since it terminates timing paths. The TN_FF_SOURCE has an edge going from it to
105 the FF output pin, with the appropriate delay. TN_FF_SOURCES have no incoming
106 edges; they start timing paths. FF clock pins have no outgoing edges, but the delay to
107 them can be looked up, and is used in the timing traversals to compute clock
108 delay for slack computations.
109 
110 In the timing graph I create, input pads and constant generators have no
111 inputs (incoming edges), just like TN_FF_SOURCES. Every input pad and output
112 pad is represented by two tnodes -- an input pin/source and an output pin/
113 sink. For an input pad the input source comes from off chip and has no fanin,
114 while the output pin drives signals within the chip. For output pads, the
115 input pin is driven by a signal (net) within the chip, and the output sink node
116 goes off chip and has no fanout (out-edges). I need two nodes to represent
117 things like pads because I mark all delay on tedges, not on tnodes.
118 
119 One other subblock that needs special attention is a constant generator.
120 This has no used inputs, but its output is used. I create an extra tnode,
121 a dummy input, in addition to the output pin tnode. The dummy tnode has
122 no fanin. Since constant generators really generate their outputs at T =
123 -infinity, I set the delay from the input tnode to the output to a large-
124 magnitude negative number. This guarantees every block that needs the
125 output of a constant generator sees it available very early.
126 
127 The main structure of the timing graph is given by the nodes and the edges
128 that connect them. We also store some extra information on tnodes that
129 (1) lets us figure out how to map from a tnode back to the netlist pin (or
130 other item) it represents, (2) lets us figure out what clock domain it is on, if
131 it is a timing path start point (no incoming edges) or end point (no outgoing
132 edges) and (3) lets us figure out the delay of the clock to that node, if it
133 is a timing path start or end point.
134 
135 To map efficiently from tedges back to the netlist pins, we create the tedge
136 array driven by a tnode that represents a netlist output pin *in the same order
137 as the netlist net pins*. That means the edge index for the tedge array from
138 such a tnode guarantees iedge = net_pin_index - 1. The code to map slacks
139 from the timing graph back to the netlist relies on this. */
140 
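/* Concrete example of the flip-flop modelling described above (a sketch of the
 * structure only; the tsu/tco values come from the architecture file): a D
 * flip-flop with pins D, Q and CLK becomes
 *
 *     D   (TN_FF_IPIN)   --tsu-->  TN_FF_SINK        (ends a timing path)
 *     TN_FF_SOURCE       --tco-->  Q   (TN_FF_OPIN)  (starts a timing path)
 *     CLK (TN_FF_CLOCK)            (no out-edges; its clock delay is looked up
 *                                   during the timing traversals)
 */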
141 /*************************** Global variables *******************************/
142 
143 t_tnode *tnode = NULL; /* [0..num_tnodes - 1] */
144 int num_tnodes = 0; /* Number of nodes (pins) in the timing graph */
145 
146 /******************** Variables local to this module ************************/
147 
148 #define NUM_BUCKETS 5 /* Used when printing slack and criticality. */
149 
150 /* Variables for "chunking" the tedge memory. If the head pointer in tedge_ch is NULL, *
151  * no timing graph exists now. */
152 
153 static t_chunk tedge_ch = {NULL, 0, NULL};
154 
155 static struct s_net *timing_nets = NULL;
156 
157 static int num_timing_nets = 0;
158 
159 static t_timing_stats * f_timing_stats = NULL; /* Critical path delay and worst-case slack per constraint. */
160 
161 static int * f_net_to_driver_tnode;
162 /* [0..num_nets - 1]. Gives the index of the tnode that drives each net.
163 Used for both pre- and post-packed netlists. If you just want the number
164 of edges on the driver tnode, use:
165  num_edges = timing_nets[inet].num_sinks;
166 instead of the equivalent but more convoluted:
167  driver_tnode = f_net_to_driver_tnode[inet];
168  num_edges = tnode[driver_tnode].num_edges;
169 Only use this array if you want the actual edges themselves or the index
170 of the driver tnode. */
171 
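/* Illustrative helper (a sketch only; nothing else in this file calls it):
 * because a driver tnode's out_edges are stored in the same order as the
 * net's pins, the tedge feeding sink pin ipin (1..num_sinks) of net inet is
 * simply edge ipin - 1 on that net's driver tnode. */
static t_tedge * example_tedge_for_net_pin(int inet, int ipin) {
	int driver_tnode = f_net_to_driver_tnode[inet];
	return &tnode[driver_tnode].out_edges[ipin - 1];
}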
172 /***************** Subroutines local to this module *************************/
173 
174 static t_slack * alloc_slacks(void);
175 
176 static void update_slacks(t_slack * slacks, int source_clock_domain, int sink_clock_domain, float criticality_denom,
177  boolean update_slack);
178 
179 static void alloc_and_load_tnodes(t_timing_inf timing_inf);
180 
181 static void alloc_and_load_tnodes_from_prepacked_netlist(float block_delay,
182  float inter_cluster_net_delay);
183 
184 static void alloc_timing_stats(void);
185 
186 static float do_timing_analysis_for_constraint(int source_clock_domain, int sink_clock_domain,
187  boolean is_prepacked, boolean is_final_analysis, long * max_critical_input_paths_ptr,
188  long * max_critical_output_paths_ptr);
189 
190 #ifdef PATH_COUNTING
191 static void do_path_counting(float criticality_denom);
192 #endif
193 
194 static void do_lut_rebalancing();
195 static void load_tnode(INP t_pb_graph_pin *pb_graph_pin, INP int iblock,
196  INOUTP int *inode, INP t_timing_inf timing_inf);
197 
198 #ifndef PATH_COUNTING
199 static void update_normalized_costs(float T_arr_max_this_domain, long max_critical_input_paths,
200  long max_critical_output_paths);
201 #endif
202 
203 static void print_primitive_as_blif (FILE *fpout, int iblk);
204 
205 static void set_and_balance_arrival_time(int to_node, int from_node, float Tdel, boolean do_lut_input_balancing);
206 
207 static void load_clock_domain_and_clock_and_io_delay(boolean is_prepacked);
208 
209 static char * find_tnode_net_name(int inode, boolean is_prepacked);
210 
211 static t_tnode * find_ff_clock_tnode(int inode, boolean is_prepacked);
212 
213 static inline int get_tnode_index(t_tnode * node);
214 
215 static inline boolean has_valid_T_arr(int inode);
216 
217 static inline boolean has_valid_T_req(int inode);
218 
219 static int find_clock(char * net_name);
220 
221 static int find_input(char * net_name);
222 
223 static int find_output(char * net_name);
224 
225 static int find_cf_constraint(char * source_clock_name, char * sink_ff_name);
226 
227 static void propagate_clock_domain_and_skew(int inode);
228 
229 static void process_constraints(void);
230 
231 static void print_global_criticality_stats(FILE * fp, float ** criticality, const char * singular_name, const char * capitalized_plural_name);
232 
233 static void print_timing_constraint_info(const char *fname);
234 
235 static void print_spaces(FILE * fp, int num_spaces);
236 
237 /********************* Subroutine definitions *******************************/
238 
239 t_slack * alloc_and_load_timing_graph(t_timing_inf timing_inf) {
240 
241  /* This routine builds the graph used for timing analysis. Every cb pin is a
242  * timing node (tnode). The connectivity between pins is *
243  * represented by timing edges (tedges). All delay is marked on edges, not *
244  * on nodes. Returns two arrays that will store slack values: *
245  * slack and criticality ([0..num_nets-1][1..num_pins-1]). */
246 
247  /* For pads, only the first two pin locations are used (input to pad is first,
248  * output of pad is second). For CLBs, all OPEN pins on the cb have their
249  * mapping set to OPEN so I won't use it by mistake. */
250 
251  int num_sinks;
252  t_slack * slacks = NULL;
253  boolean do_process_constraints = FALSE;
254 
255  if (tedge_ch.chunk_ptr_head != NULL) {
256  vpr_printf(TIO_MESSAGE_ERROR, "in alloc_and_load_timing_graph: An old timing graph still exists.\n");
257  exit(1);
258  }
259  num_timing_nets = num_nets;
260  timing_nets = clb_net;
261 
262  alloc_and_load_tnodes(timing_inf);
263 
264  num_sinks = alloc_and_load_timing_graph_levels();
265 
266  check_timing_graph(num_sinks);
267 
268  slacks = alloc_slacks();
269 
270  if (g_sdc == NULL) {
271  /* the SDC timing constraints only need to be read in once; *
272  * if they haven't been already, do it now */
273  read_sdc(timing_inf);
274  do_process_constraints = TRUE;
275  }
276 
277  load_clock_domain_and_clock_and_io_delay(FALSE);
278 
279  if (do_process_constraints)
280  process_constraints();
281 
282  if (f_timing_stats == NULL)
283  alloc_timing_stats();
284 
285  return slacks;
286 }
287 
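/* Typical call sequence for the post-packed flow, pieced together from the
 * summary at the top of this file (a sketch; argument lists abbreviated):
 *
 *     slacks = alloc_and_load_timing_graph(timing_inf);
 *     load_timing_graph_net_delays(net_delay);   (after net delays are known)
 *     do_timing_analysis(slacks, ...);           (fills slack/criticality)
 *     print_slack(slacks->slack, FALSE, "slack.echo");
 *     free_timing_graph(slacks);
 *
 * The pre-packed flow uses alloc_and_load_pre_packing_timing_graph below
 * instead of alloc_and_load_timing_graph. */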
288 t_slack * alloc_and_load_pre_packing_timing_graph(float block_delay,
289  float inter_cluster_net_delay, t_model *models, t_timing_inf timing_inf) {
290 
291  /* This routine builds the graph used for timing analysis. Every technology-
292  * mapped netlist pin is a timing node (tnode). The connectivity between pins is *
293  * represented by timing edges (tedges). All delay is marked on edges, not *
294  * on nodes. Returns two arrays that will store slack values: *
295  * slack and criticality ([0..num_nets-1][1..num_pins-1]). */
296 
297  /* For pads, only the first two pin locations are used (input to pad is first,
298  * output of pad is second). For CLBs, all OPEN pins on the cb have their
299  * mapping set to OPEN so I won't use it by mistake. */
300 
301  int num_sinks;
302  t_slack * slacks = NULL;
303  boolean do_process_constraints = FALSE;
304 
305  if (tedge_ch.chunk_ptr_head != NULL) {
306  vpr_printf(TIO_MESSAGE_ERROR, "in alloc_and_load_timing_graph: An old timing graph still exists.\n");
307  exit(1);
308  }
309 
310  num_timing_nets = num_logical_nets;
311  timing_nets = vpack_net;
312 
313  alloc_and_load_tnodes_from_prepacked_netlist(block_delay,
314  inter_cluster_net_delay);
315 
316  num_sinks = alloc_and_load_timing_graph_levels();
317 
318  slacks = alloc_slacks();
319 
320  check_timing_graph(num_sinks);
321 
324  }
325 
326  if (g_sdc == NULL) {
327  /* the SDC timing constraints only need to be read in once; *
328  * if they haven't been already, do it now */
329  read_sdc(timing_inf);
330  do_process_constraints = TRUE;
331  }
332 
333  load_clock_domain_and_clock_and_io_delay(TRUE);
334 
335  if (do_process_constraints)
336  process_constraints();
337 
338  if (f_timing_stats == NULL)
339  alloc_timing_stats();
340 
341  return slacks;
342 }
343 
344 static t_slack * alloc_slacks(void) {
345 
346  /* Allocates the slack, criticality and path_criticality structures
347  ([0..num_nets-1][1..num_pins-1]). Chunk allocated to save space. */
348 
349  int inet;
350  t_slack * slacks = (t_slack *) my_malloc(sizeof(t_slack));
351 
352  slacks->slack = (float **) my_malloc(num_timing_nets * sizeof(float *));
353  slacks->timing_criticality = (float **) my_malloc(num_timing_nets * sizeof(float *));
354 #ifdef PATH_COUNTING
355  slacks->path_criticality = (float **) my_malloc(num_timing_nets * sizeof(float *));
356 #endif
357  for (inet = 0; inet < num_timing_nets; inet++) {
358  slacks->slack[inet] = (float *) my_chunk_malloc((timing_nets[inet].num_sinks + 1) * sizeof(float), &tedge_ch);
359  slacks->timing_criticality[inet] = (float *) my_chunk_malloc((timing_nets[inet].num_sinks + 1) * sizeof(float), &tedge_ch);
360 #ifdef PATH_COUNTING
361  slacks->path_criticality[inet] = (float *) my_chunk_malloc((timing_nets[inet].num_sinks + 1) * sizeof(float), &tedge_ch);
362 #endif
363  }
364 
365  return slacks;
366 }
367 
368 void load_timing_graph_net_delays(float **net_delay) {
369 
370  /* Sets the delays of the inter-CLB nets to the values specified by *
371  * net_delay[0..num_nets-1][1..num_pins-1]. These net delays should have *
372  * been allocated and loaded with the net_delay routines. This routine *
373  * marks the corresponding edges in the timing graph with the proper delay. */
374 
375  int inet, ipin, inode;
376  t_tedge *tedge;
377 
378  for (inet = 0; inet < num_timing_nets; inet++) {
379  inode = f_net_to_driver_tnode[inet];
380  tedge = tnode[inode].out_edges;
381 
382  /* Note that the edges of a tnode corresponding to a CLB or INPAD opin must *
383  * be in the same order as the pins of the net driven by the tnode. */
384 
385  for (ipin = 1; ipin < (timing_nets[inet].num_sinks + 1); ipin++)
386  tedge[ipin - 1].Tdel = net_delay[inet][ipin];
387  }
388 }
389 
390 void free_timing_graph(t_slack * slacks) {
391 
392  int inode;
393 
394  if (tedge_ch.chunk_ptr_head == NULL) {
395  vpr_printf(TIO_MESSAGE_ERROR, "in free_timing_graph: No timing graph to free.\n");
396  exit(1);
397  }
398 
399  free_chunk_memory(&tedge_ch);
400 
401  if (tnode[0].prepacked_data) {
402  /* If we allocated prepacked_data for the first node,
403  it must be allocated for all other nodes too. */
404  for (inode = 0; inode < num_tnodes; inode++) {
405  free(tnode[inode].prepacked_data);
406  }
407  }
408  free(tnode);
409  free(f_net_to_driver_tnode);
411 
412  free(slacks->slack);
413  free(slacks->timing_criticality);
414 #ifdef PATH_COUNTING
415  free(slacks->path_criticality);
416 #endif
417  free(slacks);
418 
419  tnode = NULL;
420  num_tnodes = 0;
421  f_net_to_driver_tnode = NULL;
422  tnodes_at_level = NULL;
423  num_tnode_levels = 0;
424  slacks = NULL;
425 }
426 
427 void free_timing_stats(void) {
428  int i;
429  if(f_timing_stats != NULL) {
430  for (i = 0; i < g_sdc->num_constrained_clocks; i++) {
431  free(f_timing_stats->cpd[i]);
432  free(f_timing_stats->least_slack[i]);
433  }
434  free(f_timing_stats->cpd);
435  free(f_timing_stats->least_slack);
436  free(f_timing_stats);
437  }
438  f_timing_stats = NULL;
439 }
440 
441 void print_slack(float ** slack, boolean slack_is_normalized, const char *fname) {
442 
443  /* Prints slacks into a file. */
444 
445  int inet, iedge, ibucket, driver_tnode, num_edges, num_unused_slacks = 0;
446  t_tedge * tedge;
447  FILE *fp;
448  float max_slack = HUGE_NEGATIVE_FLOAT, min_slack = HUGE_POSITIVE_FLOAT,
449  total_slack = 0, total_negative_slack = 0, bucket_size, slk;
450  int slacks_in_bucket[NUM_BUCKETS];
451 
452  fp = my_fopen(fname, "w", 0);
453 
454  if (slack_is_normalized) {
455  fprintf(fp, "The following slacks have been normalized to be non-negative by "
456  "relaxing the required times to the maximum arrival time.\n\n");
457  }
458 
459  /* Go through slack once to get the largest and smallest slack, both for reporting and
460  so that we can delimit the buckets. Also calculate the total negative slack in the design. */
461  for (inet = 0; inet < num_timing_nets; inet++) {
462  num_edges = timing_nets[inet].num_sinks;
463  for (iedge = 0; iedge < num_edges; iedge++) {
464  slk = slack[inet][iedge + 1];
465  if (slk < HUGE_POSITIVE_FLOAT - 1) { /* if slack was analysed */
466  max_slack = std::max(max_slack, slk);
467  min_slack = std::min(min_slack, slk);
468  total_slack += slk;
469  if (slk < NEGATIVE_EPSILON) {
470  total_negative_slack -= slk; /* By convention, we'll have total_negative_slack be a positive number. */
471  }
472  } else { /* slack was never analysed */
473  num_unused_slacks++;
474  }
475  }
476  }
477 
478  if (max_slack > HUGE_NEGATIVE_FLOAT + 1) {
479  fprintf(fp, "Largest slack in design: %g\n", max_slack);
480  } else {
481  fprintf(fp, "Largest slack in design: --\n");
482  }
483 
484  if (min_slack < HUGE_POSITIVE_FLOAT - 1) {
485  fprintf(fp, "Smallest slack in design: %g\n", min_slack);
486  } else {
487  fprintf(fp, "Smallest slack in design: --\n");
488  }
489 
490  fprintf(fp, "Total slack in design: %g\n", total_slack);
491  fprintf(fp, "Total negative slack: %g\n", total_negative_slack);
492 
493  if (max_slack - min_slack > EPSILON) { /* Only sort the slacks into buckets if not all slacks are the same (if they are identical, no need to sort). */
494  /* Initialize slacks_in_bucket, an array counting how many slacks are within certain linearly-spaced ranges (buckets). */
495  for (ibucket = 0; ibucket < NUM_BUCKETS; ibucket++) {
496  slacks_in_bucket[ibucket] = 0;
497  }
498 
499  /* The size of each bucket is the range of slacks, divided by the number of buckets. */
500  bucket_size = (max_slack - min_slack)/NUM_BUCKETS;
501 
502  /* Now, pass through again, incrementing the number of slacks in the nth bucket
503  for each slack between (min_slack + n*bucket_size) and (min_slack + (n+1)*bucket_size). */
504 
505  for (inet = 0; inet < num_timing_nets; inet++) {
506  num_edges = timing_nets[inet].num_sinks;
507  for (iedge = 0; iedge < num_edges; iedge++) {
508  slk = slack[inet][iedge + 1];
509  if (slk < HUGE_POSITIVE_FLOAT - 1) {
510  /* We have to watch out for the special case where slack = max_slack, in which case ibucket = NUM_BUCKETS and we go out of bounds of the array. */
511  ibucket = std::min(NUM_BUCKETS - 1, (int) ((slk - min_slack)/bucket_size));
512  assert(ibucket >= 0 && ibucket < NUM_BUCKETS);
513  slacks_in_bucket[ibucket]++;
514  }
515  }
516  }
517 
518  /* Now print how many slacks are in each bucket. */
519  fprintf(fp, "\n\nRange\t\t");
520  for (ibucket = 0; ibucket < NUM_BUCKETS; ibucket++) {
521  fprintf(fp, "%.1e to ", min_slack);
522  min_slack += bucket_size;
523  fprintf(fp, "%.1e\t", min_slack);
524  }
525  fprintf(fp, "Not analysed");
526  fprintf(fp, "\nSlacks in range\t\t");
527  for (ibucket = 0; ibucket < NUM_BUCKETS; ibucket++) {
528  fprintf(fp, "%d\t\t\t", slacks_in_bucket[ibucket]);
529  }
530  fprintf(fp, "%d", num_unused_slacks);
531  }
532 
533  /* Finally, print all the slacks, organized by net. */
534  fprintf(fp, "\n\nNet #\tDriver_tnode\tto_node\tSlack\n\n");
535 
536  for (inet = 0; inet < num_timing_nets; inet++) {
537  driver_tnode = f_net_to_driver_tnode[inet];
538  num_edges = tnode[driver_tnode].num_edges;
539  tedge = tnode[driver_tnode].out_edges;
540  slk = slack[inet][1];
541  if (slk < HUGE_POSITIVE_FLOAT - 1) {
542  fprintf(fp, "%5d\t%5d\t\t%5d\t%g\n", inet, driver_tnode, tedge[0].to_node, slk);
543  } else { /* Slack is meaningless, so replace with --. */
544  fprintf(fp, "%5d\t%5d\t\t%5d\t--\n", inet, driver_tnode, tedge[0].to_node);
545  }
546  for (iedge = 1; iedge < num_edges; iedge++) { /* newline and indent subsequent edges after the first */
547  slk = slack[inet][iedge+1];
548  if (slk < HUGE_POSITIVE_FLOAT - 1) {
549  fprintf(fp, "\t\t\t%5d\t%g\n", tedge[iedge].to_node, slk);
550  } else { /* Slack is meaningless, so replace with --. */
551  fprintf(fp, "\t\t\t%5d\t--\n", tedge[iedge].to_node);
552  }
553  }
554  }
555 
556  fclose(fp);
557 }
558 
559 void print_criticality(t_slack * slacks, boolean criticality_is_normalized, const char *fname) {
560 
561  /* Prints timing criticalities (and path criticalities if enabled) into a file. */
562 
563  int inet, iedge, driver_tnode, num_edges;
564  t_tedge * tedge;
565  FILE *fp;
566 
567  fp = my_fopen(fname, "w", 0);
568 
569  if (criticality_is_normalized) {
570  fprintf(fp, "Timing criticalities have been normalized to be non-negative by "
571  "relaxing the required times to the maximum arrival time.\n\n");
572  }
573 
574  print_global_criticality_stats(fp, slacks->timing_criticality, "timing criticality", "Timing criticalities");
575 #ifdef PATH_COUNTING
576  print_global_criticality_stats(fp, slacks->path_criticality, "path criticality", "Path criticalities");
577 #endif
578 
579  /* Finally, print all the criticalities, organized by net. */
580  fprintf(fp, "\n\nNet #\tDriver_tnode\t to_node\tTiming criticality"
581 #ifdef PATH_COUNTING
582  "\tPath criticality"
583 #endif
584  "\n");
585 
586  for (inet = 0; inet < num_timing_nets; inet++) {
587  driver_tnode = f_net_to_driver_tnode[inet];
588  num_edges = tnode[driver_tnode].num_edges;
589  tedge = tnode[driver_tnode].out_edges;
590 
591  fprintf(fp, "\n%5d\t%5d\t\t%5d\t\t%.6f", inet, driver_tnode, tedge[0].to_node, slacks->timing_criticality[inet][1]);
592 #ifdef PATH_COUNTING
593  fprintf(fp, "\t\t%g", slacks->path_criticality[inet][1]);
594 #endif
595  for (iedge = 1; iedge < num_edges; iedge++) { /* newline and indent subsequent edges after the first */
596  fprintf(fp, "\n\t\t\t%5d\t\t%.6f", tedge[iedge].to_node, slacks->timing_criticality[inet][iedge+1]);
597 #ifdef PATH_COUNTING
598  fprintf(fp, "\t\t%g", slacks->path_criticality[inet][iedge+1]);
599 #endif
600  }
601  }
602 
603  fclose(fp);
604 }
605 
606 static void print_global_criticality_stats(FILE * fp, float ** criticality, const char * singular_name, const char * capitalized_plural_name) {
607 
608  /* Prints global stats for timing or path criticality to the file pointed to by fp,
609  including maximum criticality, minimum criticality, total criticality in the design,
610  and the number of criticalities within various ranges, or buckets. */
611 
612  int inet, iedge, num_edges, ibucket, criticalities_in_bucket[NUM_BUCKETS];
613  float crit, max_criticality = HUGE_NEGATIVE_FLOAT, min_criticality = HUGE_POSITIVE_FLOAT,
614  total_criticality = 0, bucket_size;
615 
616  /* Go through criticality once to get the largest and smallest timing criticality,
617  both for reporting and so that we can delimit the buckets. */
618  for (inet = 0; inet < num_timing_nets; inet++) {
619  num_edges = timing_nets[inet].num_sinks;
620  for (iedge = 0; iedge < num_edges; iedge++) {
621  crit = criticality[inet][iedge + 1];
622  max_criticality = std::max(max_criticality, crit);
623  min_criticality = std::min(min_criticality, crit);
624  total_criticality += crit;
625  }
626  }
627 
628  fprintf(fp, "Largest %s in design: %g\n", singular_name, max_criticality);
629  fprintf(fp, "Smallest %s in design: %g\n", singular_name, min_criticality);
630  fprintf(fp, "Total %s in design: %g\n", singular_name, total_criticality);
631 
632  if (max_criticality - min_criticality > EPSILON) { /* Only sort the criticalities into buckets if not all criticalities are the same (if they are identical, no need to sort). */
633  /* Initialize criticalities_in_bucket, an array counting how many criticalities are within certain linearly-spaced ranges (buckets). */
634  for (ibucket = 0; ibucket < NUM_BUCKETS; ibucket++) {
635  criticalities_in_bucket[ibucket] = 0;
636  }
637 
638  /* The size of each bucket is the range of criticalities, divided by the number of buckets. */
639  bucket_size = (max_criticality - min_criticality)/NUM_BUCKETS;
640 
641  /* Now, pass through again, incrementing the number of criticalities in the nth bucket
642  for each criticality between (min_criticality + n*bucket_size) and (min_criticality + (n+1)*bucket_size). */
643 
644  for (inet = 0; inet < num_timing_nets; inet++) {
645  num_edges = timing_nets[inet].num_sinks;
646  for (iedge = 0; iedge < num_edges; iedge++) {
647  crit = criticality[inet][iedge + 1];
648  /* We have to watch out for the special case where criticality = max_criticality, in which case ibucket = NUM_BUCKETS and we go out of bounds of the array. */
649  ibucket = std::min(NUM_BUCKETS - 1, (int) ((crit - min_criticality)/bucket_size));
650  assert(ibucket >= 0 && ibucket < NUM_BUCKETS);
651  criticalities_in_bucket[ibucket]++;
652  }
653  }
654 
655  /* Now print how many criticalities are in each bucket. */
656  fprintf(fp, "\nRange\t\t");
657  for (ibucket = 0; ibucket < NUM_BUCKETS; ibucket++) {
658  fprintf(fp, "%.1e to ", min_criticality);
659  min_criticality += bucket_size;
660  fprintf(fp, "%.1e\t", min_criticality);
661  }
662  fprintf(fp, "\n%s in range\t\t", capitalized_plural_name);
663  for (ibucket = 0; ibucket < NUM_BUCKETS; ibucket++) {
664  fprintf(fp, "%d\t\t\t", criticalities_in_bucket[ibucket]);
665  }
666  }
667  fprintf(fp, "\n\n");
668 }
669 
670 void print_net_delay(float **net_delay, const char *fname) {
671 
672  /* Prints the net delays into a file. */
673 
674  int inet, iedge, driver_tnode, num_edges;
675  t_tedge * tedge;
676  FILE *fp;
677 
678  fp = my_fopen(fname, "w", 0);
679 
680  fprintf(fp, "Net #\tDriver_tnode\tto_node\tDelay\n\n");
681 
682  for (inet = 0; inet < num_timing_nets; inet++) {
683  driver_tnode = f_net_to_driver_tnode[inet];
684  num_edges = tnode[driver_tnode].num_edges;
685  tedge = tnode[driver_tnode].out_edges;
686  fprintf(fp, "%5d\t%5d\t\t%5d\t%g\n", inet, driver_tnode, tedge[0].to_node, net_delay[inet][1]);
687  for (iedge = 1; iedge < num_edges; iedge++) { /* newline and indent subsequent edges after the first */
688  fprintf(fp, "\t\t\t%5d\t%g\n", tedge[iedge].to_node, net_delay[inet][iedge+1]);
689  }
690  }
691 
692  fclose(fp);
693 }
694 
695 #ifndef PATH_COUNTING
696 void print_clustering_timing_info(const char *fname) {
697  /* Print information from tnodes which is used by the clusterer. */
698  int inode;
699  FILE *fp;
700 
701  fp = my_fopen(fname, "w", 0);
702 
703  fprintf(fp, "inode ");
704  if (g_sdc->num_constrained_clocks <= 1) {
705  /* These values are from the last constraint analysed,
706  so they're not meaningful unless there was only one constraint. */
707  fprintf(fp, "Critical input paths Critical output paths ");
708  }
709  fprintf(fp, "Normalized slack Normalized Tarr Normalized total crit paths\n");
710  for (inode = 0; inode < num_tnodes; inode++) {
711  fprintf(fp, "%d\t", inode);
712  /* Only print normalized values for tnodes which have valid normalized values. (If normalized_T_arr is valid, the others will be too.) */
713  if (has_valid_normalized_T_arr(inode)) {
714  if (g_sdc->num_constrained_clocks <= 1) {
715  fprintf(fp, "%ld\t\t\t%ld\t\t\t", tnode[inode].prepacked_data->num_critical_input_paths, tnode[inode].prepacked_data->num_critical_output_paths);
716  }
717  fprintf(fp, "%f\t%f\t%f\n", tnode[inode].prepacked_data->normalized_slack, tnode[inode].prepacked_data->normalized_T_arr, tnode[inode].prepacked_data->normalized_total_critical_paths);
718  } else {
719  if (g_sdc->num_constrained_clocks <= 1) {
720  fprintf(fp, "--\t\t\t--\t\t\t");
721  }
722  fprintf(fp, "--\t\t--\t\t--\n");
723  }
724  }
725 
726  fclose(fp);
727 }
728 #endif
729 /* Counts the number of tnodes, allocates space, and loads the tnodes and their associated edges. */
730 static void alloc_and_load_tnodes(t_timing_inf timing_inf) {
731  int i, j, k;
732  int inode;
733  int num_nodes_in_block;
734  int count;
735  int iblock, irr_node;
736  int inet, dport, dpin, dblock, dnode;
737  int normalized_pin, normalization;
738  t_pb_graph_pin *ipb_graph_pin;
739  t_rr_node *local_rr_graph, *d_rr_graph;
740  int num_dangling_pins;
741 
742  f_net_to_driver_tnode = (int*)my_malloc(num_timing_nets * sizeof(int));
743 
744  for (i = 0; i < num_timing_nets; i++) {
745  f_net_to_driver_tnode[i] = OPEN;
746  }
747 
748  /* allocate space for tnodes */
749  num_tnodes = 0;
750  for (i = 0; i < num_blocks; i++) {
751  num_nodes_in_block = 0;
752  for (j = 0; j < block[i].pb->pb_graph_node->total_pb_pins; j++) {
753  if (block[i].pb->rr_graph[j].net_num != OPEN) {
754  if (block[i].pb->rr_graph[j].pb_graph_pin->type == PB_PIN_INPAD
755  || block[i].pb->rr_graph[j].pb_graph_pin->type
756  == PB_PIN_OUTPAD
757  || block[i].pb->rr_graph[j].pb_graph_pin->type
758  == PB_PIN_SEQUENTIAL) {
759  num_nodes_in_block += 2;
760  } else {
761  num_nodes_in_block++;
762  }
763  }
764  }
765  num_tnodes += num_nodes_in_block;
766  }
767  tnode = (t_tnode*)my_calloc(num_tnodes, sizeof(t_tnode));
768 
769  /* load tnodes with all info except edge info */
770  /* populate tnode lookups for edge info */
771  inode = 0;
772  for (i = 0; i < num_blocks; i++) {
773  for (j = 0; j < block[i].pb->pb_graph_node->total_pb_pins; j++) {
774  if (block[i].pb->rr_graph[j].net_num != OPEN) {
775  assert(tnode[inode].pb_graph_pin == NULL);
776  load_tnode(block[i].pb->rr_graph[j].pb_graph_pin, i, &inode,
777  timing_inf);
778  }
779  }
780  }
781  assert(inode == num_tnodes);
782  num_dangling_pins = 0;
783 
784  /* load edge delays and initialize clock domains to OPEN
785  and prepacked_data (which is not used post-packing) to NULL. */
786  for (i = 0; i < num_tnodes; i++) {
787  tnode[i].clock_domain = OPEN;
788  tnode[i].prepacked_data = NULL;
789 
790  /* 3 primary scenarios for edge delays
791  1. Point-to-point delays inside block
792  2.
793  */
794  count = 0;
795  iblock = tnode[i].block;
796  switch (tnode[i].type) {
797  case TN_INPAD_OPIN:
799  case TN_PRIMITIVE_OPIN:
800  case TN_FF_OPIN:
801  case TN_CB_IPIN:
802  /* fanout is determined by intra-cluster connections */
803  /* Allocate space for edges */
804  irr_node = tnode[i].pb_graph_pin->pin_count_in_cluster;
805  local_rr_graph = block[iblock].pb->rr_graph;
806  ipb_graph_pin = local_rr_graph[irr_node].pb_graph_pin;
807 
808  if (ipb_graph_pin->parent_node->pb_type->max_internal_delay
809  != UNDEFINED) {
810  if (pb_max_internal_delay == UNDEFINED) {
811  pb_max_internal_delay =
812  ipb_graph_pin->parent_node->pb_type->max_internal_delay;
813  pbtype_max_internal_delay =
814  ipb_graph_pin->parent_node->pb_type;
815  } else if (pb_max_internal_delay
816  < ipb_graph_pin->parent_node->pb_type->max_internal_delay) {
817  pb_max_internal_delay =
818  ipb_graph_pin->parent_node->pb_type->max_internal_delay;
819  pbtype_max_internal_delay =
820  ipb_graph_pin->parent_node->pb_type;
821  }
822  }
823 
824  for (j = 0; j < block[iblock].pb->rr_graph[irr_node].num_edges;
825  j++) {
826  dnode = local_rr_graph[irr_node].edges[j];
827  if ((local_rr_graph[dnode].prev_node == irr_node)
828  && (j == local_rr_graph[dnode].prev_edge)) {
829  count++;
830  }
831  }
832  assert(count > 0);
833  tnode[i].num_edges = count;
834  tnode[i].out_edges = (t_tedge *) my_chunk_malloc(
835  count * sizeof(t_tedge), &tedge_ch);
836 
837  /* Load edges */
838  count = 0;
839  for (j = 0; j < local_rr_graph[irr_node].num_edges; j++) {
840  dnode = local_rr_graph[irr_node].edges[j];
841  if ((local_rr_graph[dnode].prev_node == irr_node)
842  && (j == local_rr_graph[dnode].prev_edge)) {
843  assert(
844  (ipb_graph_pin->output_edges[j]->num_output_pins == 1) && (local_rr_graph[ipb_graph_pin->output_edges[j]->output_pins[0]->pin_count_in_cluster].net_num == local_rr_graph[irr_node].net_num));
845 
846  tnode[i].out_edges[count].Tdel =
847  ipb_graph_pin->output_edges[j]->delay_max;
848  tnode[i].out_edges[count].to_node =
849  get_tnode_index(local_rr_graph[dnode].tnode);
850 
851  if (vpack_net[local_rr_graph[irr_node].net_num].is_const_gen
852  == TRUE && tnode[i].type == TN_PRIMITIVE_OPIN) {
853  tnode[i].out_edges[count].Tdel = HUGE_NEGATIVE_FLOAT;
854  tnode[i].type = TN_CONSTANT_GEN_SOURCE;
855  }
856 
857  count++;
858  }
859  }
860  assert(count > 0);
861 
862  break;
863  case TN_PRIMITIVE_IPIN:
864  /* Pin info comes from pb_graph block delays
865  */
866  /*there would be no slack information if timing analysis is off*/
867  if (timing_inf.timing_analysis_enabled)
868  {
869  irr_node = tnode[i].pb_graph_pin->pin_count_in_cluster;
870  local_rr_graph = block[iblock].pb->rr_graph;
871  ipb_graph_pin = local_rr_graph[irr_node].pb_graph_pin;
872  tnode[i].num_edges = ipb_graph_pin->num_pin_timing;
873  tnode[i].out_edges = (t_tedge *) my_chunk_malloc(
874  ipb_graph_pin->num_pin_timing * sizeof(t_tedge),
875  &tedge_ch);
876  k = 0;
877 
878  for (j = 0; j < tnode[i].num_edges; j++) {
879  /* Some outpins aren't used, ignore these. Only consider output pins that are used */
880  if (local_rr_graph[ipb_graph_pin->pin_timing[j]->pin_count_in_cluster].net_num
881  != OPEN) {
882  tnode[i].out_edges[k].Tdel =
883  ipb_graph_pin->pin_timing_del_max[j];
884  tnode[i].out_edges[k].to_node =
885  get_tnode_index(local_rr_graph[ipb_graph_pin->pin_timing[j]->pin_count_in_cluster].tnode);
886  assert(tnode[i].out_edges[k].to_node != OPEN);
887  k++;
888  }
889  }
890  tnode[i].num_edges -= (j - k); /* remove unused edges */
891  if (tnode[i].num_edges == 0) {
892  /* Dangling pin */
893  num_dangling_pins++;
894  }
895  }
896  break;
897  case TN_CB_OPIN:
898  /* load up net info */
899  irr_node = tnode[i].pb_graph_pin->pin_count_in_cluster;
900  local_rr_graph = block[iblock].pb->rr_graph;
901  ipb_graph_pin = local_rr_graph[irr_node].pb_graph_pin;
902  assert(local_rr_graph[irr_node].net_num != OPEN);
903  inet = vpack_to_clb_net_mapping[local_rr_graph[irr_node].net_num];
904  assert(inet != OPEN);
905  f_net_to_driver_tnode[inet] = i;
906  tnode[i].num_edges = clb_net[inet].num_sinks;
907  tnode[i].out_edges = (t_tedge *) my_chunk_malloc(
908  clb_net[inet].num_sinks * sizeof(t_tedge),
909  &tedge_ch);
910  for (j = 1; j <= clb_net[inet].num_sinks; j++) {
911  dblock = clb_net[inet].node_block[j];
912  normalization = block[dblock].type->num_pins
913  / block[dblock].type->capacity;
914  normalized_pin = clb_net[inet].node_block_pin[j]
915  % normalization;
916  d_rr_graph = block[dblock].pb->rr_graph;
917  dpin = OPEN;
918  dport = OPEN;
919  count = 0;
920 
921  for (k = 0;
922  k < block[dblock].pb->pb_graph_node->num_input_ports
923  && dpin == OPEN; k++) {
924  if (normalized_pin >= count
925  && (count
926  + block[dblock].pb->pb_graph_node->num_input_pins[k]
927  > normalized_pin)) {
928  dpin = normalized_pin - count;
929  dport = k;
930  break;
931  }
932  count += block[dblock].pb->pb_graph_node->num_input_pins[k];
933  }
934  if (dpin == OPEN) {
935  for (k = 0;
936  k
937  < block[dblock].pb->pb_graph_node->num_output_ports
938  && dpin == OPEN; k++) {
939  count +=
940  block[dblock].pb->pb_graph_node->num_output_pins[k];
941  }
942  for (k = 0;
943  k < block[dblock].pb->pb_graph_node->num_clock_ports
944  && dpin == OPEN; k++) {
945  if (normalized_pin >= count
946  && (count
947  + block[dblock].pb->pb_graph_node->num_clock_pins[k]
948  > normalized_pin)) {
949  dpin = normalized_pin - count;
950  dport = k;
951  }
952  count +=
953  block[dblock].pb->pb_graph_node->num_clock_pins[k];
954  }
955  assert(dpin != OPEN);
956  assert(
957  inet == vpack_to_clb_net_mapping[d_rr_graph[block[dblock].pb->pb_graph_node->clock_pins[dport][dpin].pin_count_in_cluster].net_num]);
958  tnode[i].out_edges[j - 1].to_node =
959  get_tnode_index(d_rr_graph[block[dblock].pb->pb_graph_node->clock_pins[dport][dpin].pin_count_in_cluster].tnode);
960  } else {
961  assert(dpin != OPEN);
962  assert(
963  inet == vpack_to_clb_net_mapping[d_rr_graph[block[dblock].pb->pb_graph_node->input_pins[dport][dpin].pin_count_in_cluster].net_num]);
964  /* delays are assigned post routing */
965  tnode[i].out_edges[j - 1].to_node =
966  get_tnode_index(d_rr_graph[block[dblock].pb->pb_graph_node->input_pins[dport][dpin].pin_count_in_cluster].tnode);
967  }
968  tnode[i].out_edges[j - 1].Tdel = 0;
969  assert(inet != OPEN);
970  }
971  break;
972  case TN_OUTPAD_IPIN:
973  case TN_INPAD_SOURCE:
974  case TN_OUTPAD_SINK:
975  case TN_FF_SINK:
976  case TN_FF_SOURCE:
977  case TN_FF_IPIN:
978  case TN_FF_CLOCK:
979  break;
980  default:
981  vpr_printf(TIO_MESSAGE_ERROR, "Consistency check failed: Unknown tnode type %d.\n", tnode[i].type);
982  assert(0);
983  break;
984  }
985  }
986  if(num_dangling_pins > 0) {
987  vpr_printf(TIO_MESSAGE_WARNING, "Unconnected logic in design, number of dangling tnodes = %d\n", num_dangling_pins);
988  }
989 }
990 
991 /* Allocates the timing graph for the pre-packed netlist.
992  Counts the number of tnodes first,
993  then connects up the tnodes with edges.
994  */
995 static void alloc_and_load_tnodes_from_prepacked_netlist(float block_delay,
996  float inter_cluster_net_delay) {
997  int i, j, k;
998  t_model *model;
999  t_model_ports *model_port;
1000  t_pb_graph_pin *from_pb_graph_pin, *to_pb_graph_pin;
1001  int inode, inet;
1002  int incr;
1003  int count;
1004 
1005  f_net_to_driver_tnode = (int*)my_malloc(num_logical_nets * sizeof(int));
1006 
1007  for (i = 0; i < num_logical_nets; i++) {
1008  f_net_to_driver_tnode[i] = OPEN;
1009  }
1010 
1011  /* allocate space for tnodes */
1012  num_tnodes = 0;
1013  for (i = 0; i < num_logical_blocks; i++) {
1014  model = logical_block[i].model;
1015  logical_block[i].clock_net_tnode = NULL;
1016  if (logical_block[i].type == VPACK_INPAD) {
1017  logical_block[i].output_net_tnodes = (t_tnode***)my_calloc(1,
1018  sizeof(t_tnode**));
1019  num_tnodes += 2;
1020  } else if (logical_block[i].type == VPACK_OUTPAD) {
1021  logical_block[i].input_net_tnodes = (t_tnode***)my_calloc(1, sizeof(t_tnode**));
1022  num_tnodes += 2;
1023  } else {
1024  if (logical_block[i].clock_net == OPEN) {
1025  incr = 1;
1026  } else {
1027  incr = 2;
1028  }
1029  j = 0;
1030  model_port = model->inputs;
1031  while (model_port) {
1032  if (model_port->is_clock == FALSE) {
1033  for (k = 0; k < model_port->size; k++) {
1034  if (logical_block[i].input_nets[j][k] != OPEN) {
1035  num_tnodes += incr;
1036  }
1037  }
1038  j++;
1039  } else {
1040  num_tnodes++;
1041  }
1042  model_port = model_port->next;
1043  }
1044  logical_block[i].input_net_tnodes = (t_tnode ***)my_calloc(j, sizeof(t_tnode**));
1045 
1046  j = 0;
1047  model_port = model->outputs;
1048  while (model_port) {
1049  for (k = 0; k < model_port->size; k++) {
1050  if (logical_block[i].output_nets[j][k] != OPEN) {
1051  num_tnodes += incr;
1052  }
1053  }
1054  j++;
1055  model_port = model_port->next;
1056  }
1057  logical_block[i].output_net_tnodes = (t_tnode ***)my_calloc(j,
1058  sizeof(t_tnode**));
1059  }
1060  }
1061  tnode = (t_tnode *)my_calloc(num_tnodes, sizeof(t_tnode));
1062 
1063  /* Allocate space for prepacked_data, which is only used pre-packing. */
1064  for (inode = 0; inode < num_tnodes; inode++) {
1066  }
1067 
1068  /* load tnodes, alloc edges for tnodes, load all known tnodes */
1069  inode = 0;
1070  for (i = 0; i < num_logical_blocks; i++) {
1071  model = logical_block[i].model;
1072  if (logical_block[i].type == VPACK_INPAD) {
1073  logical_block[i].output_net_tnodes[0] = (t_tnode **)my_calloc(1,
1074  sizeof(t_tnode*));
1075  logical_block[i].output_net_tnodes[0][0] = &tnode[inode];
1076  f_net_to_driver_tnode[logical_block[i].output_nets[0][0]] = inode;
1077  tnode[inode].prepacked_data->model_pin = 0;
1078  tnode[inode].prepacked_data->model_port = 0;
1079  tnode[inode].prepacked_data->model_port_ptr = model->outputs;
1080  tnode[inode].block = i;
1081  tnode[inode].type = TN_INPAD_OPIN;
1082 
1083  tnode[inode].num_edges =
1084  vpack_net[logical_block[i].output_nets[0][0]].num_sinks;
1085  tnode[inode].out_edges = (t_tedge *) my_chunk_malloc(
1086  tnode[inode].num_edges * sizeof(t_tedge),
1087  &tedge_ch);
1088  tnode[inode + 1].num_edges = 1;
1089  tnode[inode + 1].out_edges = (t_tedge *) my_chunk_malloc(
1090  1 * sizeof(t_tedge), &tedge_ch);
1091  tnode[inode + 1].out_edges->Tdel = 0;
1092  tnode[inode + 1].out_edges->to_node = inode;
1093  tnode[inode + 1].type = TN_INPAD_SOURCE;
1094  tnode[inode + 1].block = i;
1095  inode += 2;
1096  } else if (logical_block[i].type == VPACK_OUTPAD) {
1097  logical_block[i].input_net_tnodes[0] = (t_tnode **)my_calloc(1,
1098  sizeof(t_tnode*));
1099  logical_block[i].input_net_tnodes[0][0] = &tnode[inode];
1100  tnode[inode].prepacked_data->model_pin = 0;
1101  tnode[inode].prepacked_data->model_port = 0;
1102  tnode[inode].prepacked_data->model_port_ptr = model->inputs;
1103  tnode[inode].block = i;
1104  tnode[inode].type = TN_OUTPAD_IPIN;
1105  tnode[inode].num_edges = 1;
1106  tnode[inode].out_edges = (t_tedge *) my_chunk_malloc(
1107  1 * sizeof(t_tedge), &tedge_ch);
1108  tnode[inode].out_edges->Tdel = 0;
1109  tnode[inode].out_edges->to_node = inode + 1;
1110  tnode[inode + 1].type = TN_OUTPAD_SINK;
1111  tnode[inode + 1].block = i;
1112  tnode[inode + 1].num_edges = 0;
1113  tnode[inode + 1].out_edges = NULL;
1114  inode += 2;
1115  } else {
1116  j = 0;
1117  model_port = model->outputs;
1118  while (model_port) {
1119  logical_block[i].output_net_tnodes[j] = (t_tnode **)my_calloc(
1120  model_port->size, sizeof(t_tnode*));
1121  for (k = 0; k < model_port->size; k++) {
1122  if (logical_block[i].output_nets[j][k] != OPEN) {
1123  tnode[inode].prepacked_data->model_pin = k;
1124  tnode[inode].prepacked_data->model_port = j;
1125  tnode[inode].prepacked_data->model_port_ptr = model_port;
1126  tnode[inode].block = i;
1127  f_net_to_driver_tnode[logical_block[i].output_nets[j][k]] =
1128  inode;
1129  logical_block[i].output_net_tnodes[j][k] =
1130  &tnode[inode];
1131 
1132  tnode[inode].num_edges =
1133  vpack_net[logical_block[i].output_nets[j][k]].num_sinks;
1134  tnode[inode].out_edges = (t_tedge *) my_chunk_malloc(
1135  tnode[inode].num_edges * sizeof(t_tedge),
1136  &tedge_ch);
1137 
1138  if (logical_block[i].clock_net == OPEN) {
1139  tnode[inode].type = TN_PRIMITIVE_OPIN;
1140  inode++;
1141  } else {
1142  /* load delays from predicted clock-to-Q time */
1143  from_pb_graph_pin = get_pb_graph_node_pin_from_model_port_pin(model_port, k, logical_block[i].expected_lowest_cost_primitive);
1144  tnode[inode].type = TN_FF_OPIN;
1145  tnode[inode + 1].num_edges = 1;
1146  tnode[inode + 1].out_edges =
1147  (t_tedge *) my_chunk_malloc(
1148  1 * sizeof(t_tedge),
1149  &tedge_ch);
1150  tnode[inode + 1].out_edges->to_node = inode;
1151  tnode[inode + 1].out_edges->Tdel = from_pb_graph_pin->tsu_tco;
1152  tnode[inode + 1].type = TN_FF_SOURCE;
1153  tnode[inode + 1].block = i;
1154  inode += 2;
1155  }
1156  }
1157  }
1158  j++;
1159  model_port = model_port->next;
1160  }
1161 
1162  j = 0;
1163  model_port = model->inputs;
1164  while (model_port) {
1165  if (model_port->is_clock == FALSE) {
1166  logical_block[i].input_net_tnodes[j] = (t_tnode **)my_calloc(
1167  model_port->size, sizeof(t_tnode*));
1168  for (k = 0; k < model_port->size; k++) {
1169  if (logical_block[i].input_nets[j][k] != OPEN) {
1170  tnode[inode].prepacked_data->model_pin = k;
1171  tnode[inode].prepacked_data->model_port = j;
1172  tnode[inode].prepacked_data->model_port_ptr = model_port;
1173  tnode[inode].block = i;
1174  logical_block[i].input_net_tnodes[j][k] =
1175  &tnode[inode];
1176  from_pb_graph_pin = get_pb_graph_node_pin_from_model_port_pin(model_port, k, logical_block[i].expected_lowest_cost_primitive);
1177  if (logical_block[i].clock_net == OPEN) {
1178  /* load predicted combinational delays to predicted edges */
1179  tnode[inode].type = TN_PRIMITIVE_IPIN;
1180  tnode[inode].out_edges =
1181  (t_tedge *) my_chunk_malloc(
1182  from_pb_graph_pin->num_pin_timing * sizeof(t_tedge),
1183  &tedge_ch);
1184  count = 0;
1185  for(int m = 0; m < from_pb_graph_pin->num_pin_timing; m++) {
1186  to_pb_graph_pin = from_pb_graph_pin->pin_timing[m];
1187  if(logical_block[i].output_nets[to_pb_graph_pin->port->model_port->index][to_pb_graph_pin->pin_number] == OPEN) {
1188  continue;
1189  }
1190  tnode[inode].out_edges[count].Tdel = from_pb_graph_pin->pin_timing_del_max[m];
1191  tnode[inode].out_edges[count].to_node =
1192  get_tnode_index(logical_block[i].output_net_tnodes[to_pb_graph_pin->port->model_port->index][to_pb_graph_pin->pin_number]);
1193  count++;
1194  }
1195  tnode[inode].num_edges = count;
1196  inode++;
1197  } else {
1198  /* load predicted setup time */
1199  tnode[inode].type = TN_FF_IPIN;
1200  tnode[inode].num_edges = 1;
1201  tnode[inode].out_edges =
1202  (t_tedge *) my_chunk_malloc(
1203  1 * sizeof(t_tedge),
1204  &tedge_ch);
1205  tnode[inode].out_edges->to_node = inode + 1;
1206  tnode[inode].out_edges->Tdel = from_pb_graph_pin->tsu_tco;
1207  tnode[inode + 1].type = TN_FF_SINK;
1208  tnode[inode + 1].num_edges = 0;
1209  tnode[inode + 1].out_edges = NULL;
1210  tnode[inode + 1].block = i;
1211  inode += 2;
1212  }
1213  }
1214  }
1215  j++;
1216  } else {
1217  if (logical_block[i].clock_net != OPEN) {
1218  assert(logical_block[i].clock_net_tnode == NULL);
1219  logical_block[i].clock_net_tnode = &tnode[inode];
1220  tnode[inode].block = i;
1221  tnode[inode].prepacked_data->model_pin = 0;
1222  tnode[inode].prepacked_data->model_port = 0;
1223  tnode[inode].prepacked_data->model_port_ptr = model_port;
1224  tnode[inode].num_edges = 0;
1225  tnode[inode].out_edges = NULL;
1226  tnode[inode].type = TN_FF_CLOCK;
1227  inode++;
1228  }
1229  }
1230  model_port = model_port->next;
1231  }
1232  }
1233  }
1234  assert(inode == num_tnodes);
1235 
1236  /* load edge delays and initialize clock domains to OPEN. */
1237  for (i = 0; i < num_tnodes; i++) {
1238  tnode[i].clock_domain = OPEN;
1239 
1240  /* 3 primary scenarios for edge delays
1241  1. Point-to-point delays inside block
1242  2.
1243  */
1244  count = 0;
1245  switch (tnode[i].type) {
1246  case TN_INPAD_OPIN:
1247  case TN_PRIMITIVE_OPIN:
1248  case TN_FF_OPIN:
1249  /* fanout is determined by the nets of the pre-packed netlist */
1250  /* Allocate space for edges */
1251  inet =
1253  assert(inet != OPEN);
1254 
1255  for (j = 1; j <= vpack_net[inet].num_sinks; j++) {
1256  if (vpack_net[inet].is_const_gen) {
1257  tnode[i].out_edges[j - 1].Tdel = HUGE_NEGATIVE_FLOAT;
1258  tnode[i].type = TN_CONSTANT_GEN_SOURCE;
1259  } else {
1260  tnode[i].out_edges[j - 1].Tdel = inter_cluster_net_delay;
1261  }
1262  if (vpack_net[inet].is_global) {
1263  assert(
1264  logical_block[vpack_net[inet].node_block[j]].clock_net == inet);
1265  tnode[i].out_edges[j - 1].to_node =
1266  get_tnode_index(logical_block[vpack_net[inet].node_block[j]].clock_net_tnode);
1267  } else {
1268  assert(
1269  logical_block[vpack_net[inet].node_block[j]].input_net_tnodes[vpack_net[inet].node_block_port[j]][vpack_net[inet].node_block_pin[j]] != NULL);
1270  tnode[i].out_edges[j - 1].to_node =
1271  get_tnode_index(logical_block[vpack_net[inet].node_block[j]].input_net_tnodes[vpack_net[inet].node_block_port[j]][vpack_net[inet].node_block_pin[j]]);
1272  }
1273  }
1274  assert(tnode[i].num_edges == vpack_net[inet].num_sinks);
1275  break;
1276  case TN_PRIMITIVE_IPIN:
1277  case TN_OUTPAD_IPIN:
1278  case TN_INPAD_SOURCE:
1279  case TN_OUTPAD_SINK:
1280  case TN_FF_SINK:
1281  case TN_FF_SOURCE:
1282  case TN_FF_IPIN:
1283  case TN_FF_CLOCK:
1284  break;
1285  default:
1286  vpr_printf(TIO_MESSAGE_ERROR, "Consistency check failed: Unknown tnode type %d.\n", tnode[i].type);
1287  assert(0);
1288  break;
1289  }
1290  }
1291 
1292  for (i = 0; i < num_logical_nets; i++) {
1293  assert(f_net_to_driver_tnode[i] != OPEN);
1294  }
1295 }
1296 
1297 static void load_tnode(INP t_pb_graph_pin *pb_graph_pin, INP int iblock,
1298  INOUTP int *inode, INP t_timing_inf timing_inf) {
1299  int i;
1300  i = *inode;
1301  tnode[i].pb_graph_pin = pb_graph_pin;
1302  tnode[i].block = iblock;
1303  block[iblock].pb->rr_graph[pb_graph_pin->pin_count_in_cluster].tnode =
1304  &tnode[i];
1305  if (tnode[i].pb_graph_pin->parent_node->pb_type->blif_model == NULL) {
1306  assert(tnode[i].pb_graph_pin->type == PB_PIN_NORMAL);
1307  if (tnode[i].pb_graph_pin->parent_node->parent_pb_graph_node == NULL) {
1308  if (tnode[i].pb_graph_pin->port->type == IN_PORT) {
1309  tnode[i].type = TN_CB_IPIN;
1310  } else {
1311  assert(tnode[i].pb_graph_pin->port->type == OUT_PORT);
1312  tnode[i].type = TN_CB_OPIN;
1313  }
1314  } else {
1315  tnode[i].type = TN_INTERMEDIATE_NODE;
1316  }
1317  } else {
1318  if (tnode[i].pb_graph_pin->type == PB_PIN_INPAD) {
1319  assert(tnode[i].pb_graph_pin->port->type == OUT_PORT);
1320  tnode[i].type = TN_INPAD_OPIN;
1321  tnode[i + 1].num_edges = 1;
1322  tnode[i + 1].out_edges = (t_tedge *) my_chunk_malloc(
1323  1 * sizeof(t_tedge), &tedge_ch);
1324  tnode[i + 1].out_edges->Tdel = 0;
1325  tnode[i + 1].out_edges->to_node = i;
1326  tnode[i + 1].pb_graph_pin = pb_graph_pin; /* Necessary for propagate_clock_domain_and_skew(). */
1327  tnode[i + 1].type = TN_INPAD_SOURCE;
1328  tnode[i + 1].block = iblock;
1329  (*inode)++;
1330  } else if (tnode[i].pb_graph_pin->type == PB_PIN_OUTPAD) {
1331  assert(tnode[i].pb_graph_pin->port->type == IN_PORT);
1332  tnode[i].type = TN_OUTPAD_IPIN;
1333  tnode[i].num_edges = 1;
1334  tnode[i].out_edges = (t_tedge *) my_chunk_malloc(
1335  1 * sizeof(t_tedge), &tedge_ch);
1336  tnode[i].out_edges->Tdel = 0;
1337  tnode[i].out_edges->to_node = i + 1;
1338  tnode[i + 1].pb_graph_pin = pb_graph_pin; /* Necessary for find_tnode_net_name(). */
1339  tnode[i + 1].type = TN_OUTPAD_SINK;
1340  tnode[i + 1].block = iblock;
1341  tnode[i + 1].num_edges = 0;
1342  tnode[i + 1].out_edges = NULL;
1343  (*inode)++;
1344  } else if (tnode[i].pb_graph_pin->type == PB_PIN_SEQUENTIAL) {
1345  if (tnode[i].pb_graph_pin->port->type == IN_PORT) {
1346  tnode[i].type = TN_FF_IPIN;
1347  tnode[i].num_edges = 1;
1348  tnode[i].out_edges = (t_tedge *) my_chunk_malloc(
1349  1 * sizeof(t_tedge), &tedge_ch);
1350  tnode[i].out_edges->Tdel = pb_graph_pin->tsu_tco;
1351  tnode[i].out_edges->to_node = i + 1;
1352  tnode[i + 1].pb_graph_pin = pb_graph_pin;
1353  tnode[i + 1].type = TN_FF_SINK;
1354  tnode[i + 1].block = iblock;
1355  tnode[i + 1].num_edges = 0;
1356  tnode[i + 1].out_edges = NULL;
1357  } else {
1358  assert(tnode[i].pb_graph_pin->port->type == OUT_PORT);
1359  tnode[i].type = TN_FF_OPIN;
1360  tnode[i + 1].num_edges = 1;
1361  tnode[i + 1].out_edges = (t_tedge *) my_chunk_malloc(
1362  1 * sizeof(t_tedge), &tedge_ch);
1363  tnode[i + 1].out_edges->Tdel = pb_graph_pin->tsu_tco;
1364  tnode[i + 1].out_edges->to_node = i;
1365  tnode[i + 1].pb_graph_pin = pb_graph_pin;
1366  tnode[i + 1].type = TN_FF_SOURCE;
1367  tnode[i + 1].block = iblock;
1368  }
1369  (*inode)++;
1370  } else if (tnode[i].pb_graph_pin->type == PB_PIN_CLOCK) {
1371  tnode[i].type = TN_FF_CLOCK;
1372  tnode[i].num_edges = 0;
1373  tnode[i].out_edges = NULL;
1374  } else {
1375  if (tnode[i].pb_graph_pin->port->type == IN_PORT) {
1376  assert(tnode[i].pb_graph_pin->type == PB_PIN_TERMINAL);
1377  tnode[i].type = TN_PRIMITIVE_IPIN;
1378  } else {
1379  assert(tnode[i].pb_graph_pin->port->type == OUT_PORT);
1380  assert(tnode[i].pb_graph_pin->type == PB_PIN_TERMINAL);
1381  tnode[i].type = TN_PRIMITIVE_OPIN;
1382  }
1383  }
1384  }
1385  (*inode)++;
1386 }
1387 
1388 void print_timing_graph(const char *fname) {
1389 
1390  /* Prints the timing graph into a file. */
1391 
1392  FILE *fp;
1393  int inode, iedge, ilevel, i;
1394  t_tedge *tedge;
1395  e_tnode_type itype;
1396  const char *tnode_type_names[] = { "TN_INPAD_SOURCE", "TN_INPAD_OPIN", "TN_OUTPAD_IPIN",
1397 
1398  "TN_OUTPAD_SINK", "TN_CB_IPIN", "TN_CB_OPIN", "TN_INTERMEDIATE_NODE",
1399  "TN_PRIMITIVE_IPIN", "TN_PRIMITIVE_OPIN", "TN_FF_IPIN", "TN_FF_OPIN", "TN_FF_SINK",
1400  "TN_FF_SOURCE", "TN_FF_CLOCK", "TN_CONSTANT_GEN_SOURCE" };
1401 
1402  fp = my_fopen(fname, "w", 0);
1403 
1404  fprintf(fp, "num_tnodes: %d\n", num_tnodes);
1405  fprintf(fp, "Node #\tType\t\tipin\tiblk\tDomain\tSkew\tI/O Delay\t# edges\t"
1406  "to_node Tdel\n\n");
1407 
1408  for (inode = 0; inode < num_tnodes; inode++) {
1409  fprintf(fp, "%d\t", inode);
1410 
1411  itype = tnode[inode].type;
1412  fprintf(fp, "%-15.15s\t", tnode_type_names[itype]);
1413 
1414  if (tnode[inode].pb_graph_pin != NULL) {
1415  fprintf(fp, "%d\t%d\t",
1416  tnode[inode].pb_graph_pin->pin_count_in_cluster,
1417  tnode[inode].block);
1418  } else {
1419  fprintf(fp, "\t%d\t", tnode[inode].block);
1420  }
1421 
1422  if (itype == TN_FF_CLOCK || itype == TN_FF_SOURCE || itype == TN_FF_SINK) {
1423  fprintf(fp, "%d\t%.3e\t\t", tnode[inode].clock_domain, tnode[inode].clock_delay);
1424  } else if (itype == TN_INPAD_SOURCE) {
1425  fprintf(fp, "%d\t\t%.3e\t", tnode[inode].clock_domain, tnode[inode].out_edges[0].Tdel);
1426  } else if (itype == TN_OUTPAD_SINK) {
1427  assert(tnode[inode-1].type == TN_OUTPAD_IPIN); /* Outpad ipins should be one prior in the tnode array */
1428  fprintf(fp, "%d\t\t%.3e\t", tnode[inode].clock_domain, tnode[inode-1].out_edges[0].Tdel);
1429  } else {
1430  fprintf(fp, "\t\t\t\t");
1431  }
1432 
1433  fprintf(fp, "%d", tnode[inode].num_edges);
1434 
1435  /* Print all edges after edge 0 on separate lines */
1436  tedge = tnode[inode].out_edges;
1437  if (tnode[inode].num_edges > 0) {
1438  fprintf(fp, "\t%4d\t%7.3g", tedge[0].to_node, tedge[0].Tdel);
1439  for (iedge = 1; iedge < tnode[inode].num_edges; iedge++) {
1440  fprintf(fp, "\n\t\t\t\t\t\t\t\t\t\t%4d\t%7.3g", tedge[iedge].to_node, tedge[iedge].Tdel);
1441  }
1442  }
1443  fprintf(fp, "\n");
1444  }
1445 
1446  fprintf(fp, "\n\nnum_tnode_levels: %d\n", num_tnode_levels);
1447 
1448  for (ilevel = 0; ilevel < num_tnode_levels; ilevel++) {
1449  fprintf(fp, "\n\nLevel: %d Num_nodes: %d\nNodes:", ilevel,
1450  tnodes_at_level[ilevel].nelem);
1451  for (i = 0; i < tnodes_at_level[ilevel].nelem; i++)
1452  fprintf(fp, "\t%d", tnodes_at_level[ilevel].list[i]);
1453  }
1454 
1455  fprintf(fp, "\n");
1456  fprintf(fp, "\n\nNet #\tNet_to_driver_tnode\n");
1457 
1458  for (i = 0; i < num_nets; i++)
1459  fprintf(fp, "%4d\t%6d\n", i, f_net_to_driver_tnode[i]);
1460 
1461  if (g_sdc->num_constrained_clocks == 1) {
1462  /* Arrival and required times, and forward and backward weights, will be meaningless for multiclock
1463  designs, since the values currently on the graph will only correspond to the most recent traversal. */
1464  fprintf(fp, "\n\nNode #\t\tT_arr\t\tT_req"
1465 #ifdef PATH_COUNTING
1466  "\tForward weight\tBackward weight"
1467 #endif
1468  "\n\n");
1469 
1470  for (inode = 0; inode < num_tnodes; inode++) {
1471  if (tnode[inode].T_arr > HUGE_NEGATIVE_FLOAT + 1) {
1472  fprintf(fp, "%d\t%12g", inode, tnode[inode].T_arr);
1473  } else {
1474  fprintf(fp, "%d\t\t -", inode);
1475  }
1476  if (tnode[inode].T_req < HUGE_POSITIVE_FLOAT - 1) {
1477  fprintf(fp, "\t%12g", tnode[inode].T_req);
1478  } else {
1479  fprintf(fp, "\t\t -");
1480  }
1481 #ifdef PATH_COUNTING
1482  fprintf(fp, "\t%12g\t%12g\n", tnode[inode].forward_weight, tnode[inode].backward_weight);
1483 #endif
1484  }
1485  }
1486 
1487  fclose(fp);
1488 }
1489 static void process_constraints(void) {
1490  /* Removes all constraints between domains which never intersect. We need to do this
1491  so that criticality_denom in do_timing_analysis is not affected by unused constraints.
1492  BFS through the levelized graph once for each source domain. Whenever we get to a sink,
1493  mark off that we've used that sink clock domain. After each traversal, set all unused
1494  constraints to DO_NOT_ANALYSE.
1495 
1496  Also, print g_sdc->domain_constraint, constrained I/Os and override constraints,
1497  and convert g_sdc->domain_constraint and flip-flop-level override constraints
1498  to be in seconds rather than nanoseconds. We don't need to normalize g_sdc->cc_constraints
1499  because they're already on the g_sdc->domain_constraint matrix, and we don't need
1500  to normalize constrained_ios because we already did the normalization when
1501  we put the delays onto the timing graph in load_clock_domain_and_clock_and_io_delay. */
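 /* Illustrative example (clock names are hypothetical, not from the netlist): suppose two
    clocks clk_a and clk_b are constrained, but no path exists from any register clocked by
    clk_a to any register or output pad clocked by clk_b. The traversal from source domain
    clk_a then never reaches a sink on clk_b's domain, so the corresponding entry of
    g_sdc->domain_constraint is set to DO_NOT_ANALYSE and do_timing_analysis skips that
    domain pair entirely. */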
1502 
1503  int source_clock_domain, sink_clock_domain, inode, ilevel, num_at_level, i,
1504  num_edges, iedge, to_node, icf, ifc, iff;
1505  t_tedge * tedge;
1506  float constraint;
1507  boolean * constraint_used = (boolean *) my_malloc(g_sdc->num_constrained_clocks * sizeof(boolean));
1508 
1509  for (source_clock_domain = 0; source_clock_domain < g_sdc->num_constrained_clocks; source_clock_domain++) {
1510  /* We're going to use arrival time to flag which nodes we've reached,
1511  even though the values we put in will not correspond to actual arrival times.
1512  Nodes which are reached on this traversal will get an arrival time of 0.
1513  Reset arrival times now to an invalid number. */
1514  for (inode = 0; inode < num_tnodes; inode++) {
1515  tnode[inode].T_arr = HUGE_NEGATIVE_FLOAT;
1516  }
1517 
1518  /* Reset all constraint_used entries. */
1519  for (sink_clock_domain = 0; sink_clock_domain < g_sdc->num_constrained_clocks; sink_clock_domain++) {
1520  constraint_used[sink_clock_domain] = FALSE;
1521  }
1522 
1523  /* Set arrival times for each top-level tnode on this clock domain. */
1524  num_at_level = tnodes_at_level[0].nelem;
1525  for (i = 0; i < num_at_level; i++) {
1526  inode = tnodes_at_level[0].list[i];
1527  if (tnode[inode].clock_domain == source_clock_domain) {
1528  tnode[inode].T_arr = 0.;
1529  }
1530  }
1531 
1532  for (ilevel = 0; ilevel < num_tnode_levels; ilevel++) { /* Go down one level at a time. */
1533  num_at_level = tnodes_at_level[ilevel].nelem;
1534 
1535  for (i = 0; i < num_at_level; i++) {
1536  inode = tnodes_at_level[ilevel].list[i]; /* Go through each of the tnodes at the level we're on. */
1537  if (has_valid_T_arr(inode)) { /* If this tnode has been reached on this traversal */
1538  num_edges = tnode[inode].num_edges;
1539  if (num_edges == 0) { /* sink */
1540  /* We've reached the sink domain of this tnode, so set constraint_used
1541  to true for this tnode's clock domain (if it has a valid one). */
1542  sink_clock_domain = tnode[inode].clock_domain;
1543  if (sink_clock_domain != -1) {
1544  constraint_used[sink_clock_domain] = TRUE;
1545  }
1546  } else {
1547  /* Set arrival time to a valid value (0.) for each tnode in this tnode's fanout. */
1548  tedge = tnode[inode].out_edges;
1549  for (iedge = 0; iedge < num_edges; iedge++) {
1550  to_node = tedge[iedge].to_node;
1551  tnode[to_node].T_arr = 0.;
1552  }
1553  }
1554  }
1555  }
1556  }
1557 
1558  /* At the end of the source domain traversal, see which sink domains haven't been hit,
1559  and set the constraint for the pair of source and sink domains to DO_NOT_ANALYSE */
1560 
1561  for (sink_clock_domain = 0; sink_clock_domain < g_sdc->num_constrained_clocks; sink_clock_domain++) {
1562  if (!constraint_used[sink_clock_domain]) {
1563  g_sdc->domain_constraint[source_clock_domain][sink_clock_domain] = DO_NOT_ANALYSE;
1564  }
1565  }
1566  }
1567 
1568  free(constraint_used);
1569 
1570  /* Print constraints */
1573  }
1574 
1575  /* Convert g_sdc->domain_constraint and ff-level override constraints to be in seconds, not nanoseconds. */
1576  for (source_clock_domain = 0; source_clock_domain < g_sdc->num_constrained_clocks; source_clock_domain++) {
1577  for (sink_clock_domain = 0; sink_clock_domain < g_sdc->num_constrained_clocks; sink_clock_domain++) {
1578  constraint = g_sdc->domain_constraint[source_clock_domain][sink_clock_domain];
1579  if (constraint > NEGATIVE_EPSILON) { /* if constraint does not equal DO_NOT_ANALYSE */
1580  g_sdc->domain_constraint[source_clock_domain][sink_clock_domain] = constraint * 1e-9;
1581  }
1582  }
1583  }
1584  for (icf = 0; icf < g_sdc->num_cf_constraints; icf++) {
1585  g_sdc->cf_constraints[icf].constraint *= 1e-9;
1586  }
1587  for (ifc = 0; ifc < g_sdc->num_fc_constraints; ifc++) {
1588  g_sdc->fc_constraints[ifc].constraint *= 1e-9;
1589  }
1590  for (iff = 0; iff < g_sdc->num_ff_constraints; iff++) {
1591  g_sdc->ff_constraints[iff].constraint *= 1e-9;
1592  }
1593 
1594  /* Finally, free g_sdc->cc_constraints since all of its info is contained in g_sdc->domain_constraint. */
1596 }
1597 
1598 static void alloc_timing_stats(void) {
1599 
1600  /* Allocate f_timing_stats data structure. */
1601 
1602  int i;
1603 
1604  f_timing_stats = (t_timing_stats *) my_malloc(sizeof(t_timing_stats));
1605  f_timing_stats->cpd = (float **) my_malloc(g_sdc->num_constrained_clocks * sizeof(float *));
1606  f_timing_stats->least_slack = (float **) my_malloc(g_sdc->num_constrained_clocks * sizeof(float *));
1607  for (i = 0; i < g_sdc->num_constrained_clocks; i++) {
1608  f_timing_stats->cpd[i] = (float *) my_malloc(g_sdc->num_constrained_clocks * sizeof(float));
1609  f_timing_stats->least_slack[i] = (float *) my_malloc(g_sdc->num_constrained_clocks * sizeof(float));
1610  }
1611 }
1612 
1613 void do_timing_analysis(t_slack * slacks, boolean is_prepacked, boolean do_lut_input_balancing, boolean is_final_analysis) {
1614 
1615 /* Performs timing analysis on the circuit. Before this routine is called, t_slack * slacks
1616  must have been allocated, and the circuit must have been converted into a timing graph.
1617  The nodes of the timing graph represent pins and the edges between them represent delays
1618  and dependencies from one pin to another. Most elements are modeled as a pair of nodes so
1619  that the delay through the element can be marked on the edge between them (e.g.
1620  TN_INPAD_SOURCE->TN_INPAD_OPIN, TN_OUTPAD_IPIN->TN_OUTPAD_SINK, TN_PRIMITIVE_IPIN->
1621  TN_PRIMITIVE_OPIN, etc.).
1622 
1623  The timing graph nodes are stored as an array, tnode [0..num_tnodes - 1]. Each tnode
1624  includes an array of all edges, tedge, which fan out from it. Each tedge includes the
1625  index of the node on its far end (in the tnode array), and the delay to that node.
1626 
1627  The timing graph has sources at each TN_FF_SOURCE (Q output), TN_INPAD_SOURCE (input I/O pad)
1628  and TN_CONSTANT_GEN_SOURCE (constant 1 or 0 generator) node and sinks at TN_FF_SINK (D input)
1629  and TN_OUTPAD_SINK (output I/O pad) nodes. Two traversals, one forward (sources to sinks)
1630  and one backward, are performed for each valid constraint (one which is not DO_NOT_ANALYSE)
1631  between a source and a sink clock domain in the matrix g_sdc->domain_constraint
1632  [0..g_sdc->num_constrained_clocks - 1][0..g_sdc->num_constrained_clocks - 1]. This matrix has been
1633  pruned so that all domain pairs with no paths between them have been set to DO_NOT_ANALYSE.
1634 
1635  During the traversal pair for each constraint, all nodes in the fanout of sources on the
1636  source clock domain are assigned a T_arr, the arrival time of the last input signal to the node.
1637  All nodes in the fanin of sinks on the sink clock domain are assigned a T_req, the required
1638  arrival time of the last input signal to the node if the critical path for this constraint is
1639  not to be lengthened. Nodes which receive both a valid T_arr and T_req are flagged with
1640  used_on_this_traversal, and nodes which are used on at least one traversal pair are flagged
1641  with has_valid_slack so that later functions know the slack is valid.
1642 
1643  After each traversal pair, a slack is calculated for each sink pin on each net (or equivalently,
1644  each connection or tedge fanning out from that net's driver tnode). Slack is calculated as:
1645  T_req (dest node) - T_arr (source node) - Tdel (edge)
1646  and represents the amount of delay which could be added to this connection before the critical
1647  path delay for this constraint would worsen. Edges on the critical path have a slack of 0.
1648  Slacks which are never used are set to HUGE_POSITIVE_FLOAT.
1649 
1650  The optimizers actually use a metric called timing_criticality. Timing criticality is defined
1651  as 1 - slack / criticality_denom, where the normalization factor criticality_denom is the max
1652  of all arrival times in the constraint and the constraint itself (T_req-relaxed slacks) or
1653  all arrival times and constraints in the design (shifted slacks). See below for a further
1654  discussion of these two regimes. Timing criticality is always between 0 (not critical) and 1
1655  (very critical). Unlike slack, which is set to HUGE_POSITIVE_FLOAT for unanalysed connections,
1656  timing criticality is 0 for these, meaning no special check has to be made for which connections
1657  have been analysed.
1658 
1659  If path counting is on (PATH_COUNTING is defined in vpr_types.h), the optimizers use a weighted
1660  sum of timing_criticality and path_criticality, the latter of which is an estimate of the
1661  importance of the number of paths using a particular connection. As a path's timing_criticality
1662  decreases, it will become exponentially less important to the path_criticality of any connection
1663  which this path uses. Path criticality also goes from 0 (not critical or unanalysed) to 1.
1664 
1665  Slack and criticalities are only calculated if both the driver of the net and the sink pin were
1666  used_on_this_traversal, and are only overwritten if lower than previously-obtained values.
1667  The optimizers actually use criticality rather than slack, but slack is useful for human
1668  designers and so we calculate it only if we need to print it.
1669 
1670  This routine outputs slack and criticality to t_slack * slacks. It also stores least slack and
1671  critical path delay per constraint [0..g_sdc->num_constrained_clocks - 1][0..g_sdc->num_constrained_clocks - 1]
1672  in the file-scope variable f_timing_stats.
1673 
1674  Is_prepacked flags whether to calculate normalized costs for the clusterer (normalized_slack,
1675  normalized_Tarr, normalized_total_critical_paths). Setting this to FALSE saves time in post-
1676  packed timing analyses.
1677 
1678  Do_lut_input_balancing flags whether to rebalance LUT inputs. LUT rebalancing takes advantage of
1679  the fact that different LUT inputs often have different delays. Since we can freely permute which
1680  LUT inputs are used by just changing the logic in the LUT, these LUT permutations can be performed
1681  late into the routing stage of the flow.
1682 
1683  Is_final_analysis flags whether this is the final analysis pass. If it is, the analyser will
1684  compute actual slacks instead of relaxed ones. We "relax" slacks by setting the required time to
1685  the maximum arrival time for tight constraints so that no slacks are negative (which confuses
1686  the optimizers). This is called "T_req-relaxed" slack. However, human designers want to see
1687  actual slack values, so we report those in the final analysis. The alternative way of making
1688  slacks positive is shifting them upwards by the value of the largest negative slack, after
1689  all traversals are complete ("shifted slacks"), which can be enabled by changing SLACK_DEFINITION
1690  from 'R' to 'S' in path_delay.h.
1691 
1692  To do: flip-flop to flip-flop and flip-flop to clock domain constraints (set_false_path, set_max_delay,
1693  and especially set_multicycle_path). All the info for these constraints is contained in g_sdc->fc_constraints and
1694  g_sdc->ff_constraints, but graph traversals are not included yet. Probably, an entire traversal will be needed for
1695  each constraint. Clock domain to flip-flop constraints are coded but not tested, and are done within
1696  existing traversals. */
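 /* Worked example of the slack and criticality formulas above (all numbers illustrative):
    for a connection with T_arr (source node) = 6 ns, Tdel (edge) = 1 ns and
    T_req (dest node) = 10 ns,
        slack = T_req - T_arr - Tdel = 10 - 6 - 1 = 3 ns.
    If criticality_denom for this constraint is 10 ns, then
        timing_criticality = 1 - slack / criticality_denom = 1 - 3/10 = 0.7.
    An edge on the critical path (slack = 0) has criticality 1; an unanalysed edge keeps
    slack = HUGE_POSITIVE_FLOAT and criticality 0. */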
1697 
1698  int i, j, source_clock_domain, sink_clock_domain, inode, inet, ipin;
1699 
1700 #if defined PATH_COUNTING || SLACK_DEFINITION == 'S'
1701  int iedge, num_edges;
1702 #endif
1703 
1704 #ifdef PATH_COUNTING
1705  float max_path_criticality = HUGE_NEGATIVE_FLOAT /* used to normalize path_criticalities */;
1706 #endif
1707 
1708  boolean update_slack = (boolean) (is_final_analysis || getEchoEnabled());
1709  /* Only update slack values if we need to print them, i.e.
1710  for the final output file (is_final_analysis) or echo files. */
1711 
1712  float criticality_denom; /* (SLACK_DEFINITION == 'R' only) For a particular constraint, the maximum of
1713  the constraint and all arrival times for the constraint's traversal. Used to
1714  normalize the clusterer's normalized_slack and, more importantly, criticality. */
1715 
1716  long max_critical_output_paths, max_critical_input_paths;
1717  t_pb *pb;
1718 
1719 #if SLACK_DEFINITION == 'S'
1720  float smallest_slack_in_design = HUGE_POSITIVE_FLOAT;
1721  /* Shift all slacks upwards by this number if it is negative. */
1722 
1723  float criticality_denom_global = HUGE_NEGATIVE_FLOAT;
1724  /* Denominator of criticality for shifted - max of all arrival times and all constraints. */
1725 #endif
1726 
1727  /* Reset LUT input rebalancing. */
1728  for (inode = 0; inode < num_tnodes; inode++) {
1729  if (tnode[inode].type == TN_PRIMITIVE_OPIN && tnode[inode].pb_graph_pin != NULL) {
1730  pb = block[tnode[inode].block].pb->rr_node_to_pb_mapping[tnode[inode].pb_graph_pin->pin_count_in_cluster];
1731  if (pb != NULL && pb->lut_pin_remap != NULL) {
1732  /* this is a LUT primitive, do pin swapping */
1733  assert(pb->pb_graph_node->pb_type->num_output_pins == 1 && pb->pb_graph_node->pb_type->num_clock_pins == 0); /* ensure LUT properties are valid */
1734  assert(pb->pb_graph_node->num_input_ports == 1);
1735  /* Clear this LUT's pin remapping; it will be rebuilt during rebalancing. */
1736  for (i = 0; i < pb->pb_graph_node->num_input_pins[0]; i++) {
1737  pb->lut_pin_remap[i] = OPEN;
1738  }
1739  }
1740  }
1741  }
1742 
1743  /* Reset all values which need to be reset once per
1744  timing analysis, rather than once per traversal pair. */
1745 
1746  /* Reset slack and criticality */
1747  for (inet = 0; inet < num_timing_nets; inet++) {
1748  for (ipin = 1; ipin <= timing_nets[inet].num_sinks; ipin++) {
1749  slacks->slack[inet][ipin] = HUGE_POSITIVE_FLOAT;
1750  slacks->timing_criticality[inet][ipin] = 0.;
1751 #ifdef PATH_COUNTING
1752  slacks->path_criticality[inet][ipin] = 0.;
1753 #endif
1754  }
1755  }
1756 
1757  /* Reset f_timing_stats. */
1758  for (i = 0; i < g_sdc->num_constrained_clocks; i++) {
1759  for (j = 0; j < g_sdc->num_constrained_clocks; j++) {
1760  f_timing_stats->cpd[i][j] = HUGE_NEGATIVE_FLOAT;
1761  f_timing_stats->least_slack[i][j] = HUGE_POSITIVE_FLOAT;
1762  }
1763  }
1764 
1765 #ifndef PATH_COUNTING
1766  /* Reset normalized values for clusterer. */
1767  if (is_prepacked) {
1768  for (inode = 0; inode < num_tnodes; inode++) {
1769  tnode[inode].prepacked_data->normalized_slack = HUGE_POSITIVE_FLOAT;
1770  tnode[inode].prepacked_data->normalized_T_arr = HUGE_NEGATIVE_FLOAT;
1771  tnode[inode].prepacked_data->normalized_total_critical_paths = HUGE_NEGATIVE_FLOAT;
1772  }
1773  }
1774 #endif
1775 
1776  if (do_lut_input_balancing) {
1777  do_lut_rebalancing();
1778  }
1779 
1780  /* For each valid constraint (pair of source and sink clock domains), we do one
1781  forward and one backward topological traversal to find arrival and required times,
1782  in do_timing_analysis_for_constraint. If path counting is on, we then do another,
1783  simpler traversal to find forward and backward weights, relying on the largest
1784  required time we found from the first traversal. After each constraint's traversals,
1785  we update the slacks, timing criticalities and (if necessary) path criticalities or
1786  normalized costs used by the clusterer. */
1787 
1788  for (source_clock_domain = 0; source_clock_domain < g_sdc->num_constrained_clocks; source_clock_domain++) {
1789  for (sink_clock_domain = 0; sink_clock_domain < g_sdc->num_constrained_clocks; sink_clock_domain++) {
1790  if (g_sdc->domain_constraint[source_clock_domain][sink_clock_domain] > NEGATIVE_EPSILON) { /* i.e. != DO_NOT_ANALYSE */
1791 
1792  /* Perform the forward and backward traversal for this constraint. */
1793  criticality_denom = do_timing_analysis_for_constraint(source_clock_domain, sink_clock_domain,
1794  is_prepacked, is_final_analysis, &max_critical_input_paths, &max_critical_output_paths);
1795 #ifdef PATH_COUNTING
1796  /* Weight the importance of each net, used in slack calculation. */
1797  do_path_counting(criticality_denom);
1798 #endif
1799 
1800  /* Update the slack and criticality for each edge of each net which was
1801  analysed on the most recent traversal and has a lower (slack) or
1802  higher (criticality) value than before. */
1803  update_slacks(slacks, source_clock_domain, sink_clock_domain, criticality_denom, update_slack);
1804 
1805 #ifndef PATH_COUNTING
1806  /* Update the normalized costs used by the clusterer. */
1807  if (is_prepacked) {
1808  update_normalized_costs(criticality_denom, max_critical_input_paths, max_critical_output_paths);
1809  }
1810 #endif
1811 
1812 #if SLACK_DEFINITION == 'S'
1813  /* Set criticality_denom_global to the max of criticality_denom over all traversals. */
1814  criticality_denom_global = std::max(criticality_denom_global, criticality_denom);
1815 #endif
1816  }
1817  }
1818  }
1819 
1820 #ifdef PATH_COUNTING
1821  /* Normalize path criticalities by the largest value in the
1822  circuit. Otherwise, path criticalities would be unbounded. */
1823 
1824  for (inet = 0; inet < num_timing_nets; inet++) {
1825  num_edges = timing_nets[inet].num_sinks;
1826  for (iedge = 0; iedge < num_edges; iedge++) {
1827  max_path_criticality = std::max(max_path_criticality, slacks->path_criticality[inet][iedge + 1]);
1828  }
1829  }
1830 
1831  for (inet = 0; inet < num_timing_nets; inet++) {
1832  num_edges = timing_nets[inet].num_sinks;
1833  for (iedge = 0; iedge < num_edges; iedge++) {
1834  slacks->path_criticality[inet][iedge + 1] /= max_path_criticality;
1835  }
1836  }
1837 
1838 #endif
1839 
1840 #if SLACK_DEFINITION == 'S'
1841  if (!is_final_analysis) {
1842 
1843  /* Find the smallest slack in the design. */
1844  for (i = 0; i < g_sdc->num_constrained_clocks; i++) {
1845  for (j = 0; j < g_sdc->num_constrained_clocks; j++) {
1846  smallest_slack_in_design = std::min(smallest_slack_in_design, f_timing_stats->least_slack[i][j]);
1847  }
1848  }
1849 
1850  /* Increase all slacks by the value of the smallest slack in the design, if it's negative. */
1851  if (smallest_slack_in_design < 0) {
1852  for (inet = 0; inet < num_timing_nets; inet++) {
1853  num_edges = timing_nets[inet].num_sinks;
1854  for (iedge = 0; iedge < num_edges; iedge++) {
1855  slacks->slack[inet][iedge + 1] -= smallest_slack_in_design;
1856  /* Remember, smallest_slack_in_design is negative, so we're INCREASING all the slacks. */
1857  /* Note that if a slack was equal to HUGE_POSITIVE_FLOAT, after the shift it will still exceed
1858  HUGE_POSITIVE_FLOAT - 1, so it will still be ignored when we calculate criticalities. */
1859  }
1860  }
1861  }
1862  }
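 /* Illustrative numbers only: if the smallest slack in the design were -2 ns, the loop above
    would have increased every valid slack by 2 ns, so the worst edge now has a shifted slack
    of 0 and will receive a criticality of 1 - 0 / criticality_denom_global = 1. Unanalysed
    edges remain near HUGE_POSITIVE_FLOAT and keep a criticality of 0. */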
1863 
1864  /* Only now, after the slacks have been shifted, can we calculate criticalities. */
1865  for (inet = 0; inet < num_timing_nets; inet++) {
1866  num_edges = timing_nets[inet].num_sinks;
1867  for (iedge = 0; iedge < num_edges; iedge++) {
1868  if (slacks->slack[inet][iedge + 1] < HUGE_POSITIVE_FLOAT - 1) { /* if the slack is valid */
1869  slacks->timing_criticality[inet][iedge + 1] = 1 - slacks->slack[inet][iedge + 1]/criticality_denom_global;
1870  }
1871  /* otherwise, criticality remains 0, as it was initialized */
1872  }
1873  }
1874 #endif
1875 }
1876 
1877 static void do_lut_rebalancing() {
1878 
1879  int inode, num_at_level, i, ilevel, num_edges, iedge, to_node;
1880  t_tedge * tedge;
1881 
1882  /* Reset all arrival times to a very large negative number. */
1883  for (inode = 0; inode < num_tnodes; inode++) {
1884  tnode[inode].T_arr = HUGE_NEGATIVE_FLOAT;
1885  }
1886 
1887  /* Set arrival times for each top-level tnode. */
1888  num_at_level = tnodes_at_level[0].nelem;
1889  for (i = 0; i < num_at_level; i++) {
1890  inode = tnodes_at_level[0].list[i];
1891  if (tnode[inode].type == TN_FF_SOURCE) {
1892  /* Set the arrival time of this flip-flop tnode to its clock skew. */
1893  tnode[inode].T_arr = tnode[inode].clock_delay;
1894  } else if (tnode[inode].type == TN_INPAD_SOURCE) {
1895  /* There's no such thing as clock skew for external clocks. The closest equivalent,
1896  input delay, is already marked on the edge coming out from this node.
1897  As a result, the signal can be said to arrive at t = 0. */
1898  tnode[inode].T_arr = 0.;
1899  }
1900  }
1901 
1902  /* Now we actually start the forward topological traversal, to compute arrival times. */
1903  for (ilevel = 0; ilevel < num_tnode_levels; ilevel++) { /* For each level of our levelized timing graph... */
1904  num_at_level = tnodes_at_level[ilevel].nelem; /* ...there are num_at_level tnodes at that level. */
1905 
1906  for (i = 0; i < num_at_level; i++) {
1907  inode = tnodes_at_level[ilevel].list[i]; /* Go through each of the tnodes at the level we're on. */
1908 
1909  num_edges = tnode[inode].num_edges; /* Get the number of edges fanning out from the node we're visiting */
1910  tedge = tnode[inode].out_edges; /* Get the list of edges from the node we're visiting */
1911 
1912  for (iedge = 0; iedge < num_edges; iedge++) { /* Now go through each edge coming out from this tnode */
1913  to_node = tedge[iedge].to_node; /* Get the index of the destination tnode of this edge. */
1914 
1915  /* The arrival time T_arr at the destination node is set to the maximum of all the
1916  possible arrival times from all edges fanning in to the node.
1917  The arrival time represents the latest time that all inputs must arrive at a node.
1918  LUT input rebalancing also occurs at this step. */
1919  set_and_balance_arrival_time(to_node, inode, tedge[iedge].Tdel, TRUE);
1920  }
1921  }
1922  }
1923 }
1924 
1925 
1926 static float do_timing_analysis_for_constraint(int source_clock_domain, int sink_clock_domain,
1927  boolean is_prepacked, boolean is_final_analysis, long * max_critical_input_paths_ptr,
1928  long * max_critical_output_paths_ptr) {
1929 
1930  /* Performs a single forward and backward traversal for the domain pair
1931  source_clock_domain and sink_clock_domain. Returns the denominator that
1932  will be later used to normalize criticality - the maximum of all arrival
1933  times from this traversal and the constraint for this pair of domains.
1934  Also returns the maximum number of critical input and output paths of any
1935  node analysed for this constraint, passed by reference from do_timing_analysis. */
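 /* Illustrative numbers only: if the constraint for this domain pair is 10 ns but the latest
    arrival time found on the traversal (max_Tarr) is 12 ns, this function returns
    max(12, 10) = 12 ns, which do_timing_analysis then uses as criticality_denom. */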
1936 
1937  int inode, num_at_level, i, total, ilevel, num_edges, iedge, to_node, icf;
1938  float constraint, Tdel, T_req, max_Tarr = HUGE_NEGATIVE_FLOAT;
1939  /* Max of all arrival times for this constraint -
1940  used to relax required times. */
1941  t_tedge * tedge;
1942  int num_dangling_nodes;
1943  boolean found;
1944  long max_critical_input_paths = 0, max_critical_output_paths = 0;
1945 
1946  /* Reset all values which need to be reset once per
1947  traversal pair, rather than once per timing analysis. */
1948 
1949  /* Reset all arrival and required times. */
1950  for (inode = 0; inode < num_tnodes; inode++) {
1951  tnode[inode].T_arr = HUGE_NEGATIVE_FLOAT;
1952  tnode[inode].T_req = HUGE_POSITIVE_FLOAT;
1953  }
1954 
1955 #ifndef PATH_COUNTING
1956  /* Reset num_critical_output_paths. */
1957  if (is_prepacked) {
1958  for (inode = 0; inode < num_tnodes; inode++) {
1959  tnode[inode].prepacked_data->num_critical_output_paths = 0;
1960  }
1961  }
1962 #endif
1963 
1964  /* Set arrival times for each top-level tnode on this source domain. */
1965  num_at_level = tnodes_at_level[0].nelem;
1966  for (i = 0; i < num_at_level; i++) {
1967  inode = tnodes_at_level[0].list[i];
1968 
1969  if (tnode[inode].clock_domain == source_clock_domain) {
1970 
1971  if (tnode[inode].type == TN_FF_SOURCE) {
1972  /* Set the arrival time of this flip-flop tnode to its clock skew. */
1973  tnode[inode].T_arr = tnode[inode].clock_delay;
1974 
1975  } else if (tnode[inode].type == TN_INPAD_SOURCE) {
1976  /* There's no such thing as clock skew for external clocks, and
1977  input delay is already marked on the edge coming out from this node.
1978  As a result, the signal can be said to arrive at t = 0. */
1979  tnode[inode].T_arr = 0.;
1980  }
1981 
1982  }
1983  }
1984 
1985  /* Compute arrival times with a forward topological traversal from sources
1986  (TN_FF_SOURCE, TN_INPAD_SOURCE, TN_CONSTANT_GEN_SOURCE) to sinks (TN_FF_SINK, TN_OUTPAD_SINK). */
1987 
1988  total = 0; /* We count up all tnodes to error-check at the end. */
1989  for (ilevel = 0; ilevel < num_tnode_levels; ilevel++) { /* For each level of our levelized timing graph... */
1990  num_at_level = tnodes_at_level[ilevel].nelem; /* ...there are num_at_level tnodes at that level. */
1991  total += num_at_level;
1992 
1993  for (i = 0; i < num_at_level; i++) {
1994  inode = tnodes_at_level[ilevel].list[i]; /* Go through each of the tnodes at the level we're on. */
1995  if (tnode[inode].T_arr < NEGATIVE_EPSILON) { /* If the arrival time is less than 0 (i.e. HUGE_NEGATIVE_FLOAT)... */
1996  continue; /* End this iteration of the num_at_level for loop since
1997  this node is not part of the clock domain we're analyzing.
1998  (If it were, it would have received an arrival time already.) */
1999  }
2000 
2001  num_edges = tnode[inode].num_edges; /* Get the number of edges fanning out from the node we're visiting */
2002  tedge = tnode[inode].out_edges; /* Get the list of edges from the node we're visiting */
2003 #ifndef PATH_COUNTING
2004  if (is_prepacked && ilevel == 0) {
2005  tnode[inode].prepacked_data->num_critical_input_paths = 1; /* Top-level tnodes have one locally-critical input path. */
2006  }
2007 
2008  /* Using a somewhat convoluted procedure inherited from T-VPack,
2009  count how many locally-critical input paths fan into each tnode,
2010  and also find the maximum number over all tnodes. */
2011  if (is_prepacked) {
2012  for (iedge = 0; iedge < num_edges; iedge++) {
2013  to_node = tedge[iedge].to_node;
2014  if (fabs(tnode[to_node].T_arr - (tnode[inode].T_arr + tedge[iedge].Tdel)) < EPSILON) {
2015  /* If the "local forward slack" (T_arr(to_node) - T_arr(inode) - T_del) for this edge
2016  is 0 (i.e. the path from inode to to_node is locally as critical as any other path to
2017  to_node), add inode's number of critical input paths to to_node's. */
2018  tnode[to_node].prepacked_data->num_critical_input_paths += tnode[inode].prepacked_data->num_critical_input_paths;
2019  } else if (tnode[to_node].T_arr < (tnode[inode].T_arr + tedge[iedge].Tdel)) {
2020  /* If the "local forward slack" for this edge is negative,
2021  reset to_node's num critical input paths to inode's number. */
2022  tnode[to_node].prepacked_data->num_critical_input_paths = tnode[inode].prepacked_data->num_critical_input_paths;
2023  }
2024  /* Set max_critical_input_paths to the maximum number of critical
2025  input paths for all tnodes analysed on this traversal. */
2026  if (tnode[to_node].prepacked_data->num_critical_input_paths > max_critical_input_paths) {
2027  max_critical_input_paths = tnode[to_node].prepacked_data->num_critical_input_paths;
2028  }
2029  }
2030  }
2031 #endif
2032  for (iedge = 0; iedge < num_edges; iedge++) { /* Now go through each edge coming out from this tnode */
2033  to_node = tedge[iedge].to_node; /* Get the index of the destination tnode of this edge. */
2034 
2035  /* The arrival time T_arr at the destination node is set to the maximum of all
2036  the possible arrival times from all edges fanning in to the node. The arrival
2037  time represents the latest time that all inputs must arrive at a node. LUT input
2038  rebalancing also occurs at this step. */
2039  set_and_balance_arrival_time(to_node, inode, tedge[iedge].Tdel, FALSE);
2040 
2041  /* Since we updated the destination node (to_node), change the max arrival
2042  time for the forward traversal if to_node's arrival time is greater than
2043  the existing maximum. */
2044  max_Tarr = std::max(max_Tarr, tnode[to_node].T_arr);
2045  }
2046  }
2047  }
2048 
2049  assert(total == num_tnodes);
2050  num_dangling_nodes = 0;
2051  /* Compute required times with a backward topological traversal from sinks to sources. */
2052 
2053  for (ilevel = num_tnode_levels - 1; ilevel >= 0; ilevel--) {
2054  num_at_level = tnodes_at_level[ilevel].nelem;
2055 
2056  for (i = 0; i < num_at_level; i++) {
2057  inode = tnodes_at_level[ilevel].list[i];
2058  num_edges = tnode[inode].num_edges;
2059 
2060  if (ilevel == 0) {
2061  if (!(tnode[inode].type == TN_INPAD_SOURCE || tnode[inode].type == TN_FF_SOURCE || tnode[inode].type == TN_CONSTANT_GEN_SOURCE)) {
2062  vpr_printf(TIO_MESSAGE_ERROR, "Timing graph started on unexpected node %s.%s[%d].\n",
2063  tnode[inode].pb_graph_pin->parent_node->pb_type->name,
2064  tnode[inode].pb_graph_pin->port->name,
2065  tnode[inode].pb_graph_pin->pin_number);
2066  vpr_printf(TIO_MESSAGE_ERROR, "This is a VPR internal error, contact VPR development team.\n");
2067  exit(1);
2068  }
2069  } else {
2070  if ((tnode[inode].type == TN_INPAD_SOURCE || tnode[inode].type == TN_FF_SOURCE || tnode[inode].type == TN_CONSTANT_GEN_SOURCE)) {
2071  vpr_printf(TIO_MESSAGE_ERROR, "Timing graph discovered unexpected edge to node %s.%s[%d].\n",
2072  tnode[inode].pb_graph_pin->parent_node->pb_type->name,
2073  tnode[inode].pb_graph_pin->port->name,
2074  tnode[inode].pb_graph_pin->pin_number);
2075  vpr_printf(TIO_MESSAGE_ERROR, "This is a VPR internal error, contact VPR development team.\n");
2076  exit(1);
2077  }
2078  }
2079 
2080  /* Unlike the forward traversal, the sinks are all on different levels, so we always have to
2081  check whether a node is a sink. We give every sink on the sink clock domain we're considering
2082  a valid required time. Every non-sink node in the fanin of one of these sinks and the fanout of
2083  some source from the forward traversal also gets a valid required time. */
2084 
2085  if (num_edges == 0) { /* sink */
2086 
2087  if (tnode[inode].type == TN_FF_CLOCK || tnode[inode].T_arr < HUGE_NEGATIVE_FLOAT + 1) {
2088  continue; /* Skip nodes on the clock net itself, and nodes with unset arrival times. */
2089  }
2090 
2091  if (!(tnode[inode].type == TN_OUTPAD_SINK || tnode[inode].type == TN_FF_SINK)) {
2092  if(is_prepacked) {
2093  vpr_printf(TIO_MESSAGE_WARNING, "Pin on block %s.%s[%d] not used\n",
2094  logical_block[tnode[inode].block].name,
2095  tnode[inode].prepacked_data->model_port_ptr->name,
2096  tnode[inode].prepacked_data->model_pin);
2097  }
2098  num_dangling_nodes++;
2099  /* Note: We still need to do the standard traversals with dangling pins so that the algorithm runs properly, but T_arr and T_req are left at values such that dangling nodes do not affect the actual timing values. */
2100  }
2101 
2102  /* Skip nodes not on the sink clock domain of the
2103  constraint we're currently considering */
2104  if (tnode[inode].clock_domain != sink_clock_domain) {
2105  continue;
2106  }
2107 
2108  /* See if there's an override constraint between the source clock domain (name is
2109  g_sdc->constrained_clocks[source_clock_domain].name) and the flip-flop or outpad we're at
2110  now (name is find_tnode_net_name(inode, is_prepacked)). We test if
2111  g_sdc->num_cf_constraints > 0 first so that we can save time looking up names in the vast
2112  majority of cases where there are no such constraints. */
2113 
2114  if (g_sdc->num_cf_constraints > 0 && (icf = find_cf_constraint(g_sdc->constrained_clocks[source_clock_domain].name, find_tnode_net_name(inode, is_prepacked))) != -1) {
2115  constraint = g_sdc->cf_constraints[icf].constraint;
2116  if (constraint < NEGATIVE_EPSILON) {
2117  /* Constraint is DO_NOT_ANALYSE (-1) for this particular sink. */
2118  continue;
2119  }
2120  } else { /* Use the default constraint from g_sdc->domain_constraint. */
2121  constraint = g_sdc->domain_constraint[source_clock_domain][sink_clock_domain];
2122  /* Constraint is guaranteed to be valid since we checked for it at the very beginning. */
2123  }
2124 
2125  /* Now we know we should analyse this tnode. */
2126 
2127 #if SLACK_DEFINITION == 'R'
2128  /* Assign the required time T_req for this leaf node, taking into account clock skew. T_req is the
2129  time by which all inputs to a tnode must arrive if this constraint's critical path delay is not to be degraded.
2130 
2131  Relax the required time at the sink node to be non-negative by taking the max of the "real" required
2132  time (constraint + tnode[inode].clock_delay) and the max arrival time in this domain (max_Tarr), except
2133  for the final analysis where we report actual slack. We do this to prevent any slacks from being
2134  negative, since negative slacks are not used effectively by the optimizers.
2135 
2136  E.g. if we have a 10 ns constraint and it takes 14 ns to get here, we'll have a slack of at most -4 ns
2137  for any edge along the path that got us here. If we say the required time is 14 ns (no less than the
2138  arrival time), we don't have a negative slack anymore. However, in the final timing analysis, the real
2139  slacks are computed (that's what human designers care about), not the relaxed ones. */
2140 
2141  if (is_final_analysis) {
2142  tnode[inode].T_req = constraint + tnode[inode].clock_delay;
2143  } else {
2144  tnode[inode].T_req = std::max(constraint + tnode[inode].clock_delay, max_Tarr);
2145  }
2146 #else
2147  /* Don't do the relaxation and always set T_req equal to the "real" required time. */
2148  tnode[inode].T_req = constraint + tnode[inode].clock_delay;
2149 #endif
2150 
2151  /* Store the largest critical path delay for this constraint (source domain AND sink domain)
2152  in the matrix f_timing_stats->cpd. C.P.D. = T_arr at destination - clock skew at destination
2153  = (datapath delay + clock delay to source) - clock delay to destination.
2154 
2155  Critical path delay is really telling us how fast we can run the source clock before we can no
2156  longer meet this constraint. e.g. If the datapath delay is 10 ns, the clock delay at source is
2157  2 ns and the clock delay at destination is 5 ns, then C.P.D. is 7 ns by the above formula. We
2158  can run the source clock at 7 ns because the clock skew gives us 3 ns extra to meet the 10 ns
2159  datapath delay. */
2160 
2161  f_timing_stats->cpd[source_clock_domain][sink_clock_domain] =
2162  std::max(f_timing_stats->cpd[source_clock_domain][sink_clock_domain],
2163  (tnode[inode].T_arr - tnode[inode].clock_delay));
2164 
2165 #ifndef PATH_COUNTING
2166  if (is_prepacked) {
2167  tnode[inode].prepacked_data->num_critical_output_paths = 1; /* Bottom-level tnodes have one locally-critical output path. */
2168  }
2169 #endif
2170  } else { /* not a sink */
2171 
2172  assert(!(tnode[inode].type == TN_OUTPAD_SINK || tnode[inode].type == TN_FF_SINK || tnode[inode].type == TN_FF_CLOCK));
2173 
2174  /* We need to skip this node unless it is on a path from source_clock_domain to
2175  sink_clock_domain. We need to skip all nodes which:
2176  1. Fan out to the sink domain but do not fan in from the source domain.
2177  2. Fan in from the source domain but do not fan out to the sink domain.
2178  3. Do not fan in or out to either domain.
2179  If a node does not fan in from the source domain, it will not have a valid arrival time.
2180  So cases 1 and 3 can be skipped by continuing if T_arr = HUGE_NEGATIVE_FLOAT. Case 2 is not
2181  as simple to detect, since the required time for this node has not yet been assigned;
2182  instead we have to look at the required time of every node in its immediate fanout. */
2183 
2184  /* Cases 1 and 3 */
2185  if (tnode[inode].T_arr < HUGE_NEGATIVE_FLOAT + 1) {
2186  continue; /* Skip nodes with unset arrival times. */
2187  }
2188 
2189  /* Case 2 */
2190  found = FALSE;
2191  tedge = tnode[inode].out_edges;
2192  for (iedge = 0; iedge < num_edges && !found; iedge++) {
2193  to_node = tedge[iedge].to_node;
2194  if (tnode[to_node].T_req < HUGE_POSITIVE_FLOAT) {
2195  found = TRUE;
2196  }
2197  }
2198  if (!found) {
2199  continue;
2200  }
2201 
2202  /* Now we know this node is on a path from source_clock_domain to
2203  sink_clock_domain, and needs to be analyzed. */
2204 
2205  /* Opposite to T_arr, set T_req to the MINIMUM of the
2206  required times of all edges fanning OUT from this node. */
2207  for (iedge = 0; iedge < num_edges; iedge++) {
2208  to_node = tedge[iedge].to_node;
2209  Tdel = tedge[iedge].Tdel;
2210  T_req = tnode[to_node].T_req;
2211  tnode[inode].T_req = std::min(tnode[inode].T_req, T_req - Tdel);
2212 
2213  /* Update least slack per constraint. This is NOT the same as the minimum slack we will
2214  calculate on this traversal for post-packed netlists, which only count inter-cluster
2215  slacks. We only look at edges adjacent to sink nodes on the sink clock domain since
2216  all paths go through one of these edges. */
2217  if (tnode[to_node].num_edges == 0 && tnode[to_node].clock_domain == sink_clock_domain) {
2218  f_timing_stats->least_slack[source_clock_domain][sink_clock_domain] =
2219  std::min(f_timing_stats->least_slack[source_clock_domain][sink_clock_domain],
2220  (T_req - Tdel - tnode[inode].T_arr));
2221  }
2222  }
2223 #ifndef PATH_COUNTING
2224 
2225  /* Similar to before, we count how many locally-critical output paths fan out from each tnode,
2226  and also find the maximum number over all tnodes. Unlike for input paths, where we haven't set
2227  the arrival time at to_node before analysing it, here the required time is set at both nodes,
2228  so the "local backward slack" (T_req(to_node) - T_req(inode) - T_del) will never be negative.
2229  Hence, we only have to test if the "local backward slack" is 0. */
2230  if (is_prepacked) {
2231  for (iedge = 0; iedge < num_edges; iedge++) {
2232  to_node = tedge[iedge].to_node;
2233  /* If the "local backward slack" (T_req(to_node) - T_req(inode) - T_del) for this edge
2234  is 0 (i.e. the path through to_node is locally as critical as any other path out of
2235  inode), add to_node's num critical output paths to inode's number. */
2236  if (fabs(tnode[to_node].T_req - (tnode[inode].T_req + tedge[iedge].Tdel)) < EPSILON) {
2237  tnode[inode].prepacked_data->num_critical_output_paths += tnode[to_node].prepacked_data->num_critical_output_paths;
2238  }
2239  /* Set max_critical_output_paths to the maximum number of critical
2240  output paths for all tnodes analysed on this traversal. */
2241  if (tnode[to_node].prepacked_data->num_critical_output_paths > max_critical_output_paths) {
2242  max_critical_output_paths = tnode[to_node].prepacked_data->num_critical_output_paths;
2243  }
2244  }
2245  }
2246 #endif
2247  }
2248  }
2249  }
2250 
2251  /* Return max critical input/output paths for this constraint through
2252  the pointers we passed in. */
2253  if (max_critical_input_paths_ptr && max_critical_output_paths_ptr) {
2254  *max_critical_input_paths_ptr = max_critical_input_paths;
2255  *max_critical_output_paths_ptr = max_critical_output_paths;
2256  }
2257 
2258  if(num_dangling_nodes > 0 && (is_final_analysis || is_prepacked)) {
2259  vpr_printf(TIO_MESSAGE_WARNING, "%d unused pins \n", num_dangling_nodes);
2260  }
2261 
2262  /* The criticality denominator is the maximum of the max
2263  arrival time and the constraint for this domain pair. */
2264  return std::max(max_Tarr, g_sdc->domain_constraint[source_clock_domain][sink_clock_domain]);
2265 }
2266 #ifdef PATH_COUNTING
2267 static void do_path_counting(float criticality_denom) {
2268  /* Count the importance of the number of paths going through each net
2269  by giving each net a forward and backward path weight. This is the first step of
2270  "A Novel Net Weighting Algorithm for Timing-Driven Placement" (Kong, 2002).
2271  We only visit nodes with set arrival and required times, so this function
2272  must be called after do_timing_analysis_for_constraint, which sets T_arr and T_req. */
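 /* Illustrative example of the discounting used below: the weight propagated across an edge
    is scaled by pow(DISCOUNT_FUNCTION_BASE, -local_slack / criticality_denom). An edge with
    zero local slack passes its weight through undiscounted (factor 1), while an edge whose
    local slack equals criticality_denom contributes only 1/DISCOUNT_FUNCTION_BASE of its
    weight, so paths with higher slack count for exponentially less. */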
2273 
2274  int inode, num_at_level, i, ilevel, num_edges, iedge, to_node;
2275  t_tedge * tedge;
2276  float forward_local_slack, backward_local_slack, discount;
2277 
2278  /* Reset forward and backward weights for all tnodes. */
2279  for (inode = 0; inode < num_tnodes; inode++) {
2280  tnode[inode].forward_weight = 0;
2281  tnode[inode].backward_weight = 0;
2282  }
2283 
2284  /* Set forward weights for each top-level tnode. */
2285  num_at_level = tnodes_at_level[0].nelem;
2286  for (i = 0; i < num_at_level; i++) {
2287  inode = tnodes_at_level[0].list[i];
2288  tnode[inode].forward_weight = 1.;
2289  }
2290 
2291  /* Do a forward topological traversal to populate forward weights. */
2292  for (ilevel = 0; ilevel < num_tnode_levels; ilevel++) {
2293  num_at_level = tnodes_at_level[ilevel].nelem;
2294 
2295  for (i = 0; i < num_at_level; i++) {
2296  inode = tnodes_at_level[ilevel].list[i];
2297  if (!(has_valid_T_arr(inode) && has_valid_T_req(inode))) {
2298  continue;
2299  }
2300  tedge = tnode[inode].out_edges;
2301  num_edges = tnode[inode].num_edges;
2302  for (iedge = 0; iedge < num_edges; iedge++) {
2303  to_node = tedge[iedge].to_node;
2304  if (!(has_valid_T_arr(to_node) && has_valid_T_req(to_node))) {
2305  continue;
2306  }
2307  forward_local_slack = tnode[to_node].T_arr - tnode[inode].T_arr - tedge[iedge].Tdel;
2308  discount = pow((float) DISCOUNT_FUNCTION_BASE, -1 * forward_local_slack / criticality_denom);
2309  tnode[to_node].forward_weight += discount * tnode[inode].forward_weight;
2310  }
2311  }
2312  }
2313 
2314  /* Do a backward topological traversal to populate backward weights.
2315  Since the sinks are all on different levels, we have to check for them as we go. */
2316  for (ilevel = num_tnode_levels - 1; ilevel >= 0; ilevel--) {
2317  num_at_level = tnodes_at_level[ilevel].nelem;
2318 
2319  for (i = 0; i < num_at_level; i++) {
2320  inode = tnodes_at_level[ilevel].list[i];
2321  if (!(has_valid_T_arr(inode) && has_valid_T_req(inode))) {
2322  continue;
2323  }
2324  num_edges = tnode[inode].num_edges;
2325  if (num_edges == 0) { /* sink */
2326  tnode[inode].backward_weight = 1.;
2327  } else {
2328  tedge = tnode[inode].out_edges;
2329  for (iedge = 0; iedge < num_edges; iedge++) {
2330  to_node = tedge[iedge].to_node;
2331  if (!(has_valid_T_arr(to_node) && has_valid_T_req(to_node))) {
2332  continue;
2333  }
2334  backward_local_slack = tnode[to_node].T_req - tnode[inode].T_req - tedge[iedge].Tdel;
2335  discount = pow((float) DISCOUNT_FUNCTION_BASE, -1 * backward_local_slack / criticality_denom);
2336  tnode[inode].backward_weight += discount * tnode[to_node].backward_weight;
2337  }
2338  }
2339  }
2340  }
2341 }
2342 #endif
2343 
2344 static void update_slacks(t_slack * slacks, int source_clock_domain, int sink_clock_domain, float criticality_denom,
2345  boolean update_slack) {
2346 
2347  /* Updates the slack and criticality of each sink pin, or equivalently
2348  each edge, of each net. Go through the list of nets. If the net's
2349  driver tnode has been used, go through each tedge of that tnode and
2350  take the minimum of the slack/criticality for this traversal and the
2351  existing values. Since the criticality_denom can vary greatly between
2352  traversals, we have to update slack and criticality separately.
2353  Only update slack if we need to print it later (update_slack == TRUE).
2354 
2355  Note: there is a correspondence in indexing between out_edges and the
2356  net data structure: out_edges[iedge] = net[inet].node_block[iedge + 1]
2357  There is an offset of 1 because net[inet].node_block includes the driver
2358  node at index 0, while out_edges is part of the driver node and does
2359  not bother to refer to itself. */
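 /* Illustrative example of the indexing correspondence above: for a net inet with 3 sinks,
    the driver tnode's out_edges[0..2] line up with net[inet].node_block[1..3], so the values
    for edge iedge are stored in slacks->slack[inet][iedge + 1] and
    slacks->timing_criticality[inet][iedge + 1]. */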
2360 
2361  int inet, iedge, inode, to_node, num_edges;
2362  t_tedge *tedge;
2363  float T_arr, Tdel, T_req, slk, timing_criticality;
2364 
2365  for (inet = 0; inet < num_timing_nets; inet++) {
2366  inode = f_net_to_driver_tnode[inet];
2367  T_arr = tnode[inode].T_arr;
2368 
2369  if (!(has_valid_T_arr(inode) && has_valid_T_req(inode))) {
2370  continue; /* Only update this net on this traversal if its
2371  driver node has been updated on this traversal. */
2372  }
2373 
2374  num_edges = tnode[inode].num_edges;
2375  tedge = tnode[inode].out_edges;
2376 
2377  for (iedge = 0; iedge < num_edges; iedge++) {
2378  to_node = tedge[iedge].to_node;
2379  if (!(has_valid_T_arr(to_node) && has_valid_T_req(to_node))) {
2380  continue; /* Only update this edge on this traversal if this
2381  particular sink node has been updated on this traversal. */
2382  }
2383  Tdel = tedge[iedge].Tdel;
2384  T_req = tnode[to_node].T_req;
2385 
2386  if (update_slack) {
2387  /* Update the slack for this edge. */
2388  slk = T_req - T_arr - Tdel;
2389  if (slk < slacks->slack[inet][iedge + 1]) {
2390  /* Only update on this traversal if this edge would have
2391  lower slack from this traversal than its current value. */
2392  slacks->slack[inet][iedge + 1] = slk;
2393  }
2394  }
2395 
2396 #if SLACK_DEFINITION == 'R'
2397  /* Since criticality_denom is not the same on each traversal,
2398  we have to update criticality separately. */
2399  timing_criticality = 1 - (T_req - T_arr - Tdel)/criticality_denom;
2400  if (timing_criticality > slacks->timing_criticality[inet][iedge + 1]) {
2401  slacks->timing_criticality[inet][iedge + 1] = timing_criticality;
2402  }
2403  #ifdef PATH_COUNTING
2404  /* Also update path criticality separately. Kong uses slack / T_crit for the exponent,
2405  which is equivalent to criticality - 1. Depending on how PATH_COUNTING is defined,
2406  different functional forms are used. */
2407  #if PATH_COUNTING == 'S' /* Use sum of forward and backward weights. */
2408  slacks->path_criticality[inet][iedge + 1] = max(slacks->path_criticality[inet][iedge + 1],
2409  (tnode[inode].forward_weight + tnode[to_node].backward_weight) *
2410  pow((float) FINAL_DISCOUNT_FUNCTION_BASE, timing_criticality - 1));
2411  #elif PATH_COUNTING == 'P' /* Use product of forward and backward weights. */
2412  slacks->path_criticality[inet][iedge + 1] = max(slacks->path_criticality[inet][iedge + 1],
2413  tnode[inode].forward_weight * tnode[to_node].backward_weight *
2414  pow((float) FINAL_DISCOUNT_FUNCTION_BASE, timing_criticality - 1));
2415  #elif PATH_COUNTING == 'L' /* Use natural log of product of forward and backward weights. */
2416  slacks->path_criticality[inet][iedge + 1] = max(slacks->path_criticality[inet][iedge + 1],
2417  log(tnode[inode].forward_weight * tnode[to_node].backward_weight) *
2418  pow((float) FINAL_DISCOUNT_FUNCTION_BASE, timing_criticality - 1));
2419  #elif PATH_COUNTING == 'R' /* Use product of natural logs of forward and backward weights. */
2420  slacks->path_criticality[inet][iedge + 1] = max(slacks->path_criticality[inet][iedge + 1],
2421  log(tnode[inode].forward_weight) * log(tnode[to_node].backward_weight) *
2422  pow((float) FINAL_DISCOUNT_FUNCTION_BASE, timing_criticality - 1));
2423  #endif
2424  #endif
2425 #endif
2426  }
2427  }
2428 }
2429 
2430 void print_lut_remapping(const char *fname) {
2431  FILE *fp;
2432  int inode, i;
2433  t_pb *pb;
2434 
2435 
2436  fp = my_fopen(fname, "w", 0);
2437  fprintf(fp, "# LUT_Name\tinput_pin_mapping\n");
2438 
2439  for (inode = 0; inode < num_tnodes; inode++) {
2440  /* Print LUT input rebalancing */
2441  if (tnode[inode].type == TN_PRIMITIVE_OPIN && tnode[inode].pb_graph_pin != NULL) {
2442  pb = block[tnode[inode].block].pb->rr_node_to_pb_mapping[tnode[inode].pb_graph_pin->pin_count_in_cluster];
2443  if (pb != NULL && pb->lut_pin_remap != NULL) {
2444  assert(pb->pb_graph_node->pb_type->num_output_pins == 1 && pb->pb_graph_node->pb_type->num_clock_pins == 0); /* ensure LUT properties are valid */
2445  assert(pb->pb_graph_node->num_input_ports == 1);
2446  fprintf(fp, "%s", pb->name);
2447  for (i = 0; i < pb->pb_graph_node->num_input_pins[0]; i++) {
2448  fprintf(fp, "\t%d", pb->lut_pin_remap[i]);
2449  }
2450  fprintf(fp, "\n");
2451  }
2452  }
2453  }
2454 
2455  fclose(fp);
2456 }
2457 
2458 void print_critical_path(const char *fname) {
2459 
2460  /* Prints the critical path to a file. */
2461 
2462  t_linked_int *critical_path_head, *critical_path_node;
2463  FILE *fp;
2464  int non_global_nets_on_crit_path, global_nets_on_crit_path;
2465  int tnodes_on_crit_path, inode, iblk, inet;
2466  e_tnode_type type;
2467  float total_net_delay, total_logic_delay, Tdel;
2468 
2469  critical_path_head = allocate_and_load_critical_path();
2470  critical_path_node = critical_path_head;
2471 
2472  fp = my_fopen(fname, "w", 0);
2473 
2474  non_global_nets_on_crit_path = 0;
2475  global_nets_on_crit_path = 0;
2476  tnodes_on_crit_path = 0;
2477  total_net_delay = 0.;
2478  total_logic_delay = 0.;
2479 
2480  while (critical_path_node != NULL) {
2481  Tdel = print_critical_path_node(fp, critical_path_node);
2482  inode = critical_path_node->data;
2483  type = tnode[inode].type;
2484  tnodes_on_crit_path++;
2485 
2486  if (type == TN_CB_OPIN) {
2487  get_tnode_block_and_output_net(inode, &iblk, &inet);
2488 
2489  if (!timing_nets[inet].is_global)
2490  non_global_nets_on_crit_path++;
2491  else
2492  global_nets_on_crit_path++;
2493 
2494  total_net_delay += Tdel;
2495  } else {
2496  total_logic_delay += Tdel;
2497  }
2498 
2499  critical_path_node = critical_path_node->next;
2500  }
2501 
2502  fprintf(fp, "\nTnodes on critical path: %d Non-global nets on critical path: %d."
2503  "\n", tnodes_on_crit_path, non_global_nets_on_crit_path);
2504  fprintf(fp, "Global nets on critical path: %d.\n", global_nets_on_crit_path);
2505  fprintf(fp, "Total logic delay: %g (s) Total net delay: %g (s)\n",
2506  total_logic_delay, total_net_delay);
2507 
2508  vpr_printf(TIO_MESSAGE_INFO, "Nets on critical path: %d normal, %d global.\n",
2509  non_global_nets_on_crit_path, global_nets_on_crit_path);
2510 
2511  vpr_printf(TIO_MESSAGE_INFO, "Total logic delay: %g (s), total net delay: %g (s)\n",
2512  total_logic_delay, total_net_delay);
2513 
2514  /* Make sure total_logic_delay and total_net_delay add
2515  up to the critical path delay, within 5 decimal places. */
2516  assert(total_logic_delay + total_net_delay - get_critical_path_delay()/1e9 < 1e-5);
2517 
2518  fclose(fp);
2519  free_int_list(&critical_path_head);
2520 }
2521 
2522 t_linked_int * allocate_and_load_critical_path(void) {
2523 
2524  /* Finds the critical path and returns a list of the tnodes on the critical *
2525  * path in a linked list, from the path SOURCE to the path SINK. */
2526 
2527  t_linked_int *critical_path_head, *curr_crit_node, *prev_crit_node;
2528  int inode, iedge, to_node, num_at_level_zero, i, j, crit_node = OPEN, num_edges;
2529  int source_clock_domain = UNDEFINED, sink_clock_domain = UNDEFINED;
2530  float min_slack = HUGE_POSITIVE_FLOAT, slack;
2531  t_tedge *tedge;
2532 
2533  /* If there's only one clock, we can use the arrival and required times
2534  currently on the timing graph to find the critical path. If there are multiple
2535  clocks, however, the values currently on the timing graph correspond to the
2536  last constraint (pair of clock domains) analysed, which may not be the constraint
2537  with the critical path. In this case, we have to find the constraint with the
2538  least slack and redo the timing analysis for this constraint so we get the right
2539  values onto the timing graph. */
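 /* Illustrative values only: with two constrained clocks, if f_timing_stats->least_slack
    were { {1.2, 0.4}, {-0.3, 2.0} }, the smallest entry is at [1][0], so the loop below picks
    source_clock_domain = 1 and sink_clock_domain = 0, and the timing analysis is redone for
    that domain pair before the critical path is traced. */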
2540 
2541  if (g_sdc->num_constrained_clocks > 1) {
2542  /* The critical path belongs to the source and sink clock domains
2543  with the least slack. Find these clock domains now. */
2544 
2545  for (i = 0; i < g_sdc->num_constrained_clocks; i++) {
2546  for (j = 0; j < g_sdc->num_constrained_clocks; j++) {
2547  if (min_slack > f_timing_stats->least_slack[i][j]) {
2548  min_slack = f_timing_stats->least_slack[i][j];
2549  source_clock_domain = i;
2550  sink_clock_domain = j;
2551  }
2552  }
2553  }
2554 
2555  /* Do a timing analysis for this clock domain pair only.
2556  Set is_prepacked to FALSE since we don't care about the clusterer's normalized values.
2557  Set is_final_analysis to FALSE to get actual, rather than relaxed, slacks.
2558  Set max critical input/output paths to NULL since they aren't used unless is_prepacked is TRUE. */
2559  do_timing_analysis_for_constraint(source_clock_domain, sink_clock_domain, FALSE, FALSE, (long*)NULL, (long*)NULL);
2560  }
2561 
2562  /* Start at the source (level-0) tnode with the least slack (T_req-T_arr).
2563  This will form the head of our linked list of tnodes on the critical path. */
2564  min_slack = HUGE_POSITIVE_FLOAT;
2565  num_at_level_zero = tnodes_at_level[0].nelem;
2566  for (i = 0; i < num_at_level_zero; i++) {
2567  inode = tnodes_at_level[0].list[i];
2568  if (has_valid_T_arr(inode) && has_valid_T_req(inode)) { /* Valid arrival and required times */
2569  slack = tnode[inode].T_req - tnode[inode].T_arr;
2570  if (slack < min_slack) {
2571  crit_node = inode;
2572  min_slack = slack;
2573  }
2574  }
2575  }
2576  critical_path_head = (t_linked_int *) my_malloc(sizeof(t_linked_int));
2577  critical_path_head->data = crit_node;
2578  assert(crit_node != OPEN);
2579  prev_crit_node = critical_path_head;
2580  num_edges = tnode[crit_node].num_edges;
2581 
2582  /* Keep adding the tnode in this tnode's fanout which has the least slack
2583  to our critical path linked list, then jump to that tnode and repeat, until
2584  we hit a tnode with no edges, which is the sink of the critical path. */
2585  while (num_edges != 0) {
2586  curr_crit_node = (t_linked_int *) my_malloc(sizeof(t_linked_int));
2587  prev_crit_node->next = curr_crit_node;
2588  tedge = tnode[crit_node].out_edges;
2589  min_slack = HUGE_POSITIVE_FLOAT;
2590 
2591  for (iedge = 0; iedge < num_edges; iedge++) {
2592  to_node = tedge[iedge].to_node;
2593  if (has_valid_T_arr(to_node) && has_valid_T_req(to_node)) { /* Valid arrival and required times */
2594  slack = tnode[to_node].T_req - tnode[to_node].T_arr;
2595  if (slack < min_slack) {
2596  crit_node = to_node;
2597  min_slack = slack;
2598  }
2599  }
2600  }
2601 
2602  curr_crit_node->data = crit_node;
2603  prev_crit_node = curr_crit_node;
2604  num_edges = tnode[crit_node].num_edges;
2605  }
2606 
2607  prev_crit_node->next = NULL;
2608  return (critical_path_head);
2609 }
2610 
2611 void get_tnode_block_and_output_net(int inode, int *iblk_ptr, int *inet_ptr) {
2612 
2613  /* Returns the index of the block that this tnode is part of. If the tnode *
2614  * is a TN_CB_OPIN or TN_INPAD_OPIN (i.e. if it drives a net), the net index is *
2615  * returned via inet_ptr. Otherwise inet_ptr points at OPEN. */
2616 
2617  int inet, ipin, iblk;
2618  e_tnode_type tnode_type;
2619 
2620  iblk = tnode[inode].block;
2621  tnode_type = tnode[inode].type;
2622 
2623  if (tnode_type == TN_CB_OPIN) {
2624  ipin = tnode[inode].pb_graph_pin->pin_count_in_cluster;
2625  inet = block[iblk].pb->rr_graph[ipin].net_num;
2626  assert(inet != OPEN);
2627  inet = vpack_to_clb_net_mapping[inet];
2628  } else {
2629  inet = OPEN;
2630  }
2631 
2632  *iblk_ptr = iblk;
2633  *inet_ptr = inet;
2634 }
2635 
2636 void do_constant_net_delay_timing_analysis(t_timing_inf timing_inf,
2637  float constant_net_delay_value) {
2638 
2639  /* Does a simple timing analysis in which each net is assumed to have a *
2640  * constant delay value. Used only when operation == TIMING_ANALYSIS_ONLY. */
2641 
2642  /*struct s_linked_vptr *net_delay_chunk_list_head;*/
2643 
2644  t_chunk net_delay_ch = {NULL, 0, NULL};
2645  t_slack * slacks = NULL;
2646  float **net_delay = NULL;
2647 
2648  slacks = alloc_and_load_timing_graph(timing_inf);
2649  net_delay = alloc_net_delay(&net_delay_ch, timing_nets,
2650  num_timing_nets);
2651 
2652  load_constant_net_delay(net_delay, constant_net_delay_value, timing_nets,
2653  num_timing_nets);
2654  load_timing_graph_net_delays(net_delay);
2655 
2656  do_timing_analysis(slacks, FALSE, FALSE, TRUE);
2657 
2658  if (getEchoEnabled()) {
2660  print_critical_path("critical_path.echo");
2669  }
2670 
2672 
2673  free_timing_graph(slacks);
2674  free_net_delay(net_delay, &net_delay_ch);
2675 }
2676 #ifndef PATH_COUNTING
2677 static void update_normalized_costs(float criticality_denom, long max_critical_input_paths,
2678  long max_critical_output_paths) {
2679  int inode;
2680 
2681  /* Update the normalized costs for the clusterer. On each traversal, each cost is
2682  updated for tnodes analysed on this traversal if it would give this tnode a higher
2683  criticality when calculating block criticality for the clusterer. */
2684 
2685  assert(criticality_denom != 0); /* Possible if timing analysis is being run pre-packing
2686  with all delays set to 0. This is not currently done,
2687  but if you're going to do it, you need to decide how
2688  best to normalize these values to suit your purposes. */
2689 
2690  for (inode = 0; inode < num_tnodes; inode++) {
2691  /* Only calculate for tnodes which have valid arrival and required times. */
2692  if (has_valid_T_arr(inode) && has_valid_T_req(inode)) {
2693  tnode[inode].prepacked_data->normalized_slack = std::min(tnode[inode].prepacked_data->normalized_slack,
2694  (tnode[inode].T_req - tnode[inode].T_arr)/criticality_denom);
2695  tnode[inode].prepacked_data->normalized_T_arr = std::max(tnode[inode].prepacked_data->normalized_T_arr,
2696  tnode[inode].T_arr/criticality_denom);
2697  tnode[inode].prepacked_data->normalized_total_critical_paths = std::max(tnode[inode].prepacked_data->normalized_total_critical_paths,
2698  ((float) tnode[inode].prepacked_data->num_critical_input_paths + tnode[inode].prepacked_data->num_critical_output_paths) /
2699  ((float) max_critical_input_paths + max_critical_output_paths));
2700  }
2701  }
2702 }
2703 #endif
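
/* Standalone illustrative sketch (not part of path_delay.c) of the three
   clusterer cost updates above, applied to one stand-in node rather than the
   real tnode/prepacked_data. Each cost only moves in the direction that would
   make the node look more critical. */

#include <stdio.h>

struct sketch_costs {
    float normalized_slack;                 /* starts at a huge positive value */
    float normalized_T_arr;                 /* starts at a huge negative value */
    float normalized_total_critical_paths;  /* starts at a huge negative value */
};

static void update_costs(struct sketch_costs *c, float T_arr, float T_req,
        float denom, float crit_paths, float max_crit_paths) {
    float slack_norm = (T_req - T_arr) / denom;
    float t_arr_norm = T_arr / denom;
    float paths_norm = crit_paths / max_crit_paths;

    if (slack_norm < c->normalized_slack)
        c->normalized_slack = slack_norm;                 /* lower slack => more critical    */
    if (t_arr_norm > c->normalized_T_arr)
        c->normalized_T_arr = t_arr_norm;                 /* later arrival => more critical  */
    if (paths_norm > c->normalized_total_critical_paths)
        c->normalized_total_critical_paths = paths_norm;  /* more critical paths => more critical */
}

int main(void) {
    struct sketch_costs c = { 1e30f, -1e30f, -1e30f };
    update_costs(&c, 4.0f, 5.0f, 10.0f, 6.0f, 8.0f);
    printf("%g %g %g\n", c.normalized_slack, c.normalized_T_arr,
           c.normalized_total_critical_paths);   /* 0.1 0.4 0.75 */
    return 0;
}
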
2704 /* Set new arrival time
2705  Special code for LUTs to enable LUT input delay balancing
2706 */
2707 static void set_and_balance_arrival_time(int to_node, int from_node, float Tdel, boolean do_lut_input_balancing) {
2708  int i, j;
2709  t_pb *pb;
2710  boolean rebalance;
2711  t_tnode *input_tnode;
2712 
2713  boolean *assigned = NULL;
2714  int fastest_unassigned_pin, most_crit_tnode, most_crit_pin;
2715  float min_delay, highest_T_arr, balanced_T_arr;
2716 
2717  /* Normal case for determining arrival time */
2718  tnode[to_node].T_arr = std::max(tnode[to_node].T_arr, tnode[from_node].T_arr + Tdel);
2719 
2720  /* Do LUT input rebalancing for LUTs */
2721  if (do_lut_input_balancing && tnode[to_node].type == TN_PRIMITIVE_OPIN && tnode[to_node].pb_graph_pin != NULL) {
2722  pb = block[tnode[to_node].block].pb->rr_node_to_pb_mapping[tnode[to_node].pb_graph_pin->pin_count_in_cluster];
2723  if (pb != NULL && pb->lut_pin_remap != NULL) {
2724  /* this is a LUT primitive, do pin swapping */
2725  assert(pb->pb_graph_node->pb_type->num_output_pins == 1 && pb->pb_graph_node->pb_type->num_clock_pins == 0); /* ensure LUT properties are valid */
2726  assert(pb->pb_graph_node->num_input_ports == 1);
2727  assert(tnode[from_node].block == tnode[to_node].block);
2728 
2729  /* assign from_node to default location */
2730  assert(pb->lut_pin_remap[tnode[from_node].pb_graph_pin->pin_number] == OPEN);
2731  pb->lut_pin_remap[tnode[from_node].pb_graph_pin->pin_number] = tnode[from_node].pb_graph_pin->pin_number;
2732 
2733  /* If all input pins are known, perform LUT input delay rebalancing; otherwise do nothing */
2734  rebalance = TRUE;
2735  for (i = 0; i < pb->pb_graph_node->num_input_pins[0]; i++) {
2736  input_tnode = block[tnode[to_node].block].pb->rr_graph[pb->pb_graph_node->input_pins[0][i].pin_count_in_cluster].tnode;
2737  if (input_tnode != NULL && pb->lut_pin_remap[i] == OPEN) {
2738  rebalance = FALSE;
2739  }
2740  }
2741  if (rebalance == TRUE) {
2742  /* Rebalance LUT inputs so that the most critical paths get the fastest inputs */
2743  balanced_T_arr = OPEN;
2744  assigned = (boolean*)my_calloc(pb->pb_graph_node->num_input_pins[0], sizeof(boolean));
2745  /* Clear pin remapping */
2746  for (i = 0; i < pb->pb_graph_node->num_input_pins[0]; i++) {
2747  pb->lut_pin_remap[i] = OPEN;
2748  }
2749  /* load new T_arr and pin mapping */
2750  for (i = 0; i < pb->pb_graph_node->num_input_pins[0]; i++) {
2751  /* Find fastest physical input pin of LUT */
2752  fastest_unassigned_pin = OPEN;
2753  min_delay = OPEN;
2754  for (j = 0; j < pb->pb_graph_node->num_input_pins[0]; j++) {
2755  if (pb->lut_pin_remap[j] == OPEN) {
2756  if (fastest_unassigned_pin == OPEN) {
2757  fastest_unassigned_pin = j;
2758  min_delay = pb->pb_graph_node->input_pins[0][j].pin_timing_del_max[0];
2759  } else if (min_delay > pb->pb_graph_node->input_pins[0][j].pin_timing_del_max[0]) {
2760  fastest_unassigned_pin = j;
2761  min_delay = pb->pb_graph_node->input_pins[0][j].pin_timing_del_max[0];
2762  }
2763  }
2764  }
2765  assert(fastest_unassigned_pin != OPEN);
2766 
2767  /* Find most critical LUT input pin in user circuit */
2768  most_crit_tnode = OPEN;
2769  highest_T_arr = OPEN;
2770  most_crit_pin = OPEN;
2771  for (j = 0; j < pb->pb_graph_node->num_input_pins[0]; j++) {
2772  input_tnode = block[tnode[to_node].block].pb->rr_graph[pb->pb_graph_node->input_pins[0][j].pin_count_in_cluster].tnode;
2773  if (input_tnode != NULL && assigned[j] == FALSE) {
2774  if (most_crit_tnode == OPEN) {
2775  most_crit_tnode = get_tnode_index(input_tnode);
2776  highest_T_arr = input_tnode->T_arr;
2777  most_crit_pin = j;
2778  } else if (highest_T_arr < input_tnode->T_arr) {
2779  most_crit_tnode = get_tnode_index(input_tnode);
2780  highest_T_arr = input_tnode->T_arr;
2781  most_crit_pin = j;
2782  }
2783  }
2784  }
2785 
2786  if (most_crit_tnode == OPEN) {
2787  break;
2788  } else {
2789  assert(tnode[most_crit_tnode].num_edges == 1);
2790  tnode[most_crit_tnode].out_edges[0].Tdel = min_delay;
2791  pb->lut_pin_remap[fastest_unassigned_pin] = most_crit_pin;
2792  assigned[most_crit_pin] = TRUE;
2793  if (balanced_T_arr < min_delay + highest_T_arr) {
2794  balanced_T_arr = min_delay + highest_T_arr;
2795  }
2796  }
2797  }
2798  free(assigned);
2799  if (balanced_T_arr != OPEN) {
2800  tnode[to_node].T_arr = balanced_T_arr;
2801  }
2802  }
2803  }
2804  }
2805 }
2806 
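
/* Standalone illustrative sketch (not part of path_delay.c) of the greedy LUT
   input balancing above: repeatedly pair the most critical remaining input
   signal (latest T_arr) with the fastest remaining physical LUT pin, and track
   the resulting balanced arrival time. Plain arrays stand in for the
   pb_graph/rr_graph structures, and the LUT size K is made up. */

#include <stdio.h>

#define K 4

int main(void) {
    float pin_delay[K]   = { 0.10f, 0.20f, 0.30f, 0.40f }; /* physical pin delays  */
    float input_T_arr[K] = { 1.0f, 3.0f, 2.0f, 0.5f };     /* signal arrival times */
    int   pin_assigned[K] = { 0 }, sig_assigned[K] = { 0 };
    int   remap[K];                                        /* remap[pin] = signal  */
    float balanced_T_arr = 0.0f;
    int n;

    for (n = 0; n < K; n++) {
        int fastest = -1, most_crit = -1, j;
        for (j = 0; j < K; j++)    /* fastest unassigned physical pin */
            if (!pin_assigned[j] && (fastest == -1 || pin_delay[j] < pin_delay[fastest]))
                fastest = j;
        for (j = 0; j < K; j++)    /* most critical unassigned signal */
            if (!sig_assigned[j] && (most_crit == -1 || input_T_arr[j] > input_T_arr[most_crit]))
                most_crit = j;
        remap[fastest] = most_crit;
        pin_assigned[fastest] = 1;
        sig_assigned[most_crit] = 1;
        if (pin_delay[fastest] + input_T_arr[most_crit] > balanced_T_arr)
            balanced_T_arr = pin_delay[fastest] + input_T_arr[most_crit];
    }

    for (n = 0; n < K; n++)
        printf("pin %d (delay %.2f) <- signal %d (T_arr %.2f)\n",
               n, pin_delay[n], remap[n], input_T_arr[remap[n]]);
    printf("balanced T_arr = %.2f\n", balanced_T_arr);   /* 3.10 */
    return 0;
}
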
2807 static void load_clock_domain_and_clock_and_io_delay(boolean is_prepacked) {
2808 /* Loads clock domain and clock delay onto TN_FF_SOURCE and TN_FF_SINK tnodes.
2809 The clock domain of each clock is its index in g_sdc->constrained_clocks.
2810 We do this by matching each clock input pad to a constrained clock name, then
2811 propagating forward its domain index to all flip-flops fed by it (TN_FF_CLOCK
2812 tnodes), then later looking up the TN_FF_CLOCK tnode corresponding to every
2813 TN_FF_SOURCE and TN_FF_SINK tnode. We also add up the delays along the clock net
2814 to each TN_FF_CLOCK tnode to give it (and the SOURCE/SINK nodes) a clock delay.
2815 
2816 Also loads input delay/output delay (from set_input_delay or set_output_delay SDC
2817 constraints) onto the tedges between TN_INPAD_SOURCE/OPIN and TN_OUTPAD_IPIN/SINK
2818 tnodes. Finds fanout of each clock domain, including virtual (external) clocks.
2819 Marks unconstrained I/Os with a dummy clock domain (-1). */
2820 
2821  int i, iclock, inode, num_at_level, clock_index, input_index, output_index;
2822  char * net_name;
2823  t_tnode * clock_node;
2824 
2825  /* Wipe fanout of each clock domain in g_sdc->constrained_clocks. */
2826  for (iclock = 0; iclock < g_sdc->num_constrained_clocks; iclock++) {
2827  g_sdc->constrained_clocks[iclock].fanout = 0;
2828  }
2829 
2830  /* First, visit all TN_INPAD_SOURCE tnodes */
2831  num_at_level = tnodes_at_level[0].nelem; /* There are num_at_level top-level tnodes. */
2832  for (i = 0; i < num_at_level; i++) {
2833  inode = tnodes_at_level[0].list[i]; /* Iterate through each tnode. inode is the index of the tnode in the array tnode. */
2834  if (tnode[inode].type == TN_INPAD_SOURCE) { /* See if this node is the start of an I/O pad (as opposed to a flip-flop source). */
2835  net_name = find_tnode_net_name(inode, is_prepacked);
2836  if ((clock_index = find_clock(net_name)) != -1) { /* We have a clock inpad. */
2837  /* Set clock skew to 0 at the source and propagate skew
2838  recursively to all connected nodes, adding delay as we go.
2839  Set the clock domain to the index of the clock in the
2840  g_sdc->constrained_clocks array and propagate unchanged. */
2841  tnode[inode].clock_delay = 0.;
2842  tnode[inode].clock_domain = clock_index;
2843  propagate_clock_domain_and_skew(inode);
2844 
2845  /* Set the clock domain of this clock inpad to -1, so that we do not analyse it.
2846  If we did not do this, the clock net would be analysed on the same iteration of
2847  the timing analyzer as the flip-flops it drives! */
2848  tnode[inode].clock_domain = -1;
2849 
2850  } else if ((input_index = find_input(net_name)) != -1) {
2851  /* We have a constrained non-clock inpad - find the clock it's constrained on. */
2852  clock_index = find_clock(g_sdc->constrained_inputs[input_index].clock_name);
2853  assert(clock_index != -1);
2854  /* The clock domain for this input is that of its virtual clock */
2855  tnode[inode].clock_domain = clock_index;
2856  /* Increment the fanout of this virtual clock domain. */
2857  g_sdc->constrained_clocks[clock_index].fanout++;
2858  /* Mark input delay specified in SDC file on the timing graph edge leading out from the TN_INPAD_SOURCE node. */
2859  tnode[inode].out_edges[0].Tdel = g_sdc->constrained_inputs[input_index].delay * 1e-9; /* normalize to be in seconds not ns */
2860  } else { /* We have an unconstrained input - mark with dummy clock domain and do not analyze. */
2861  tnode[inode].clock_domain = -1;
2862  }
2863  }
2864  }
2865 
2866  /* Second, visit all TN_OUTPAD_SINK tnodes. Unlike for TN_INPAD_SOURCE tnodes,
2867  we have to search the entire tnode array since these are all on different levels. */
2868  for (inode = 0; inode < num_tnodes; inode++) {
2869  if (tnode[inode].type == TN_OUTPAD_SINK) {
2870  /* Since the pb_graph_pin of TN_OUTPAD_SINK tnodes points to NULL, we have to find the net
2871  from the pb_graph_pin of the corresponding TN_OUTPAD_IPIN node.
2872  Exploit the fact that the TN_OUTPAD_IPIN node will always be one prior in the tnode array. */
2873  assert(tnode[inode - 1].type == TN_OUTPAD_IPIN);
2874  net_name = find_tnode_net_name(inode, is_prepacked);
2875  output_index = find_output(net_name + 4); /* the + 4 removes the prefix "out:" automatically prepended to outputs */
2876  if (output_index != -1) {
2877  /* We have a constrained outpad, find the clock it's constrained on. */
2878  clock_index = find_clock(g_sdc->constrained_outputs[output_index].clock_name);
2879  assert(clock_index != -1);
2880  /* The clock domain for this output is that of its virtual clock */
2881  tnode[inode].clock_domain = clock_index;
2882  /* Increment the fanout of this virtual clock domain. */
2883  g_sdc->constrained_clocks[clock_index].fanout++;
2884  /* Mark output delay specified in SDC file on the timing graph edge leading into the TN_OUTPAD_SINK node.
2885  However, this edge is part of the corresponding TN_OUTPAD_IPIN node.
2886  Exploit the fact that the TN_OUTPAD_IPIN node will always be one prior in the tnode array. */
2887  tnode[inode - 1].out_edges[0].Tdel = g_sdc->constrained_outputs[output_index].delay * 1e-9; /* normalize to be in seconds not ns */
2888 
2889  } else { /* We have an unconstrained output - mark with dummy clock domain and do not analyze. */
2890  tnode[inode].clock_domain = -1;
2891  }
2892  }
2893  }
2894 
2895  /* Third, visit all TN_FF_SOURCE and TN_FF_SINK tnodes, and transfer the clock domain and skew from their corresponding TN_FF_CLOCK tnodes*/
2896  for (inode = 0; inode < num_tnodes; inode++) {
2897  if (tnode[inode].type == TN_FF_SOURCE || tnode[inode].type == TN_FF_SINK) {
2898  clock_node = find_ff_clock_tnode(inode, is_prepacked);
2899  tnode[inode].clock_domain = clock_node->clock_domain;
2900  tnode[inode].clock_delay = clock_node->clock_delay;
2901  }
2902  }
2903 }
2904 
2905 static void propagate_clock_domain_and_skew(int inode) {
2906 /* Given a tnode indexed by inode (which is part of a clock net),
2907  * propagate forward the clock domain (unchanged) and skew (adding the delay of edges) to all nodes in its fanout.
2908  * We then call recursively on all children in a depth-first search. If num_edges is 0, we should be at a TN_FF_CLOCK tnode; we then set the
2909  * TN_FF_SOURCE and TN_FF_SINK nodes to have the same clock domain and skew as the TN_FF_CLOCK node. We implicitly rely on a tnode not being
2910  * part of two separate clock nets, since undefined behaviour would result if one DFS overwrote the results of another. This may
2911  * be problematic in cases of multiplexed or locally-generated clocks. */
2912 
2913  int iedge, to_node;
2914  t_tedge * tedge;
2915 
2916  tedge = tnode[inode].out_edges; /* Get the list of edges from the node we're visiting. */
2917 
2918  if (!tedge) { /* Leaf/sink node; base case of the recursion. */
2919  assert(tnode[inode].type == TN_FF_CLOCK);
2920  assert(tnode[inode].clock_domain != -1); /* Make sure clock domain is valid. */
2921  g_sdc->constrained_clocks[tnode[inode].clock_domain].fanout++;
2922  return;
2923  }
2924 
2925  for (iedge = 0; iedge < tnode[inode].num_edges; iedge++) { /* Go through each edge coming out from this tnode */
2926  to_node = tedge[iedge].to_node;
2927  /* Propagate clock skew forward along this clock net, adding the delay of the wires (edges) of the clock network. */
2928  tnode[to_node].clock_delay = tnode[inode].clock_delay + tedge[iedge].Tdel;
2929  /* Propagate clock domain forward unchanged */
2930  tnode[to_node].clock_domain = tnode[inode].clock_domain;
2931  /* Finally, call recursively on the destination tnode. */
2932  propagate_clock_domain_and_skew(to_node);
2933  }
2934 }
2935 
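
/* Standalone illustrative sketch (not part of path_delay.c) of the recursive
   skew/domain propagation above: starting at a clock source with delay 0, push
   the accumulated edge delay and the (fixed) clock domain forward to every node
   in the clock net's fanout. Simplified stand-in structures, not the real tnode
   array. */

#include <stdio.h>

struct cnode {
    int   num_edges;
    int   to[2];
    float edge_del[2];
    int   clock_domain;
    float clock_delay;
};

static void propagate(struct cnode *g, int inode) {
    int iedge;
    if (g[inode].num_edges == 0)   /* leaf: a TN_FF_CLOCK-like sink */
        return;
    for (iedge = 0; iedge < g[inode].num_edges; iedge++) {
        int to = g[inode].to[iedge];
        g[to].clock_delay  = g[inode].clock_delay + g[inode].edge_del[iedge];
        g[to].clock_domain = g[inode].clock_domain;
        propagate(g, to);
    }
}

int main(void) {
    struct cnode g[4] = {
        { 2, {1, 2}, {0.1f, 0.2f}, 0, 0.0f },   /* clock source, domain 0 */
        { 1, {3, 0}, {0.3f, 0.0f}, -1, 0.0f },
        { 0, {0, 0}, {0.0f, 0.0f}, -1, 0.0f },
        { 0, {0, 0}, {0.0f, 0.0f}, -1, 0.0f }
    };
    propagate(g, 0);
    printf("node 3: domain %d, skew %.2f\n", g[3].clock_domain, g[3].clock_delay); /* domain 0, 0.40 */
    return 0;
}
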
2936 static char * find_tnode_net_name(int inode, boolean is_prepacked) {
2937  /* Finds the name of the net which a tnode (inode) is on (different for pre-/post-packed netlists). */
2938 
2939  int logic_block; /* Name chosen not to conflict with the array logical_block */
2940  char * net_name;
2941  t_pb * physical_block;
2942  t_pb_graph_pin * pb_graph_pin;
2943 
2944  logic_block = tnode[inode].block;
2945  if (is_prepacked) {
2946  net_name = logical_block[logic_block].name;
2947  } else {
2948  physical_block = block[logic_block].pb;
2949  pb_graph_pin = tnode[inode].pb_graph_pin;
2950  net_name = physical_block->rr_node_to_pb_mapping[pb_graph_pin->pin_count_in_cluster]->name;
2951  }
2952  return net_name;
2953 }
2954 
2955 static t_tnode * find_ff_clock_tnode(int inode, boolean is_prepacked) {
2956  /* Finds the TN_FF_CLOCK tnode on the same flipflop as an TN_FF_SOURCE or TN_FF_SINK tnode. */
2957 
2958  int logic_block; /* Name chosen not to conflict with the array logical_block */
2959  t_tnode * ff_clock_tnode;
2960  t_rr_node * rr_graph;
2961  t_pb_graph_node * parent_pb_graph_node;
2962  t_pb_graph_pin * ff_source_or_sink_pb_graph_pin, * clock_pb_graph_pin;
2963 
2964  logic_block = tnode[inode].block;
2965  if (is_prepacked) {
2966  ff_clock_tnode = logical_block[logic_block].clock_net_tnode;
2967  } else {
2968  rr_graph = block[logic_block].pb->rr_graph;
2969  ff_source_or_sink_pb_graph_pin = tnode[inode].pb_graph_pin;
2970  parent_pb_graph_node = ff_source_or_sink_pb_graph_pin->parent_node;
2971  /* Make sure there's only one clock port and only one clock pin in that port */
2972  assert(parent_pb_graph_node->num_clock_ports == 1);
2973  assert(parent_pb_graph_node->num_clock_pins[0] == 1);
2974  clock_pb_graph_pin = &parent_pb_graph_node->clock_pins[0][0];
2975  ff_clock_tnode = rr_graph[clock_pb_graph_pin->pin_count_in_cluster].tnode;
2976  }
2977  assert(ff_clock_tnode != NULL);
2978  assert(ff_clock_tnode->type == TN_FF_CLOCK);
2979  return ff_clock_tnode;
2980 }
2981 
2982 static int find_clock(char * net_name) {
2983 /* Given a string net_name, find whether it's the name of a clock in the array g_sdc->constrained_clocks. *
2984  * if it is, return the clock's index in g_sdc->constrained_clocks; if it's not, return -1. */
2985  int index;
2986  for (index = 0; index < g_sdc->num_constrained_clocks; index++) {
2987  if (strcmp(net_name, g_sdc->constrained_clocks[index].name) == 0) {
2988  return index;
2989  }
2990  }
2991  return -1;
2992 }
2993 
2994 static int find_input(char * net_name) {
2995 /* Given a string net_name, find whether it's the name of a constrained input in the array g_sdc->constrained_inputs. *
2996  * if it is, return its index in g_sdc->constrained_inputs; if it's not, return -1. */
2997  int index;
2998  for (index = 0; index < g_sdc->num_constrained_inputs; index++) {
2999  if (strcmp(net_name, g_sdc->constrained_inputs[index].name) == 0) {
3000  return index;
3001  }
3002  }
3003  return -1;
3004 }
3005 
3006 static int find_output(char * net_name) {
3007 /* Given a string net_name, find whether it's the name of a constrained output in the array g_sdc->constrained_outputs. *
3008  * if it is, return its index in g_sdc->constrained_outputs; if it's not, return -1. */
3009  int index;
3010  for (index = 0; index < g_sdc->num_constrained_outputs; index++) {
3011  if (strcmp(net_name, g_sdc->constrained_outputs[index].name) == 0) {
3012  return index;
3013  }
3014  }
3015  return -1;
3016 }
3017 
3018 static int find_cf_constraint(char * source_clock_name, char * sink_ff_name) {
3019  /* Given a source clock domain and a sink flip-flop, find out if there's an override constraint between them.
3020  If there is, return the index in g_sdc->cf_constraints; if there is not, return -1. */
3021  int icf, isource, isink;
3022 
3023  for (icf = 0; icf < g_sdc->num_cf_constraints; icf++) {
3024  for (isource = 0; isource < g_sdc->cf_constraints[icf].num_source; isource++) {
3025  if (strcmp(g_sdc->cf_constraints[icf].source_list[isource], source_clock_name) == 0) {
3026  for (isink = 0; isink < g_sdc->cf_constraints[icf].num_sink; isink++) {
3027  if (strcmp(g_sdc->cf_constraints[icf].sink_list[isink], sink_ff_name) == 0) {
3028  return icf;
3029  }
3030  }
3031  }
3032  }
3033  }
3034  return -1;
3035 }
3036 
3037 static inline int get_tnode_index(t_tnode * node) {
3038  /* Returns the index of node in the array tnode [0..num_tnodes - 1]
3039  using pointer arithmetic. */
3040  return node - tnode;
3041 }
3042 
3043 static inline boolean has_valid_T_arr(int inode) {
3044  /* Has this tnode's arrival time been changed from its original value of HUGE_NEGATIVE_FLOAT? */
3045  return (boolean) (tnode[inode].T_arr > HUGE_NEGATIVE_FLOAT + 1);
3046 }
3047 
3048 static inline boolean has_valid_T_req(int inode) {
3049  /* Has this tnode's required time been changed from its original value of HUGE_POSITIVE_FLOAT? */
3050  return (boolean) (tnode[inode].T_req < HUGE_POSITIVE_FLOAT - 1);
3051 }
3052 
3053 #ifndef PATH_COUNTING
3054 boolean has_valid_normalized_T_arr(int inode) {
3055  /* Has this tnode's normalized_T_arr been changed from its original value of HUGE_NEGATIVE_FLOAT? */
3056  return (boolean) (tnode[inode].prepacked_data->normalized_T_arr > HUGE_NEGATIVE_FLOAT + 1);
3057 }
3058 #endif
3059 
3060 float get_critical_path_delay(void) {
3061  /* Finds the critical path delay, which is the minimum clock period required to meet the constraint
3062  corresponding to the pair of source and sink clock domains with the least slack in the design. */
3063 
3064  int source_clock_domain, sink_clock_domain;
3065  float least_slack_in_design = HUGE_POSITIVE_FLOAT, critical_path_delay = UNDEFINED;
3066 
3067  if (!g_sdc) return UNDEFINED; /* If timing analysis is off, for instance. */
3068 
3069  for (source_clock_domain = 0; source_clock_domain < g_sdc->num_constrained_clocks; source_clock_domain++) {
3070  for (sink_clock_domain = 0; sink_clock_domain < g_sdc->num_constrained_clocks; sink_clock_domain++) {
3071  if (least_slack_in_design > f_timing_stats->least_slack[source_clock_domain][sink_clock_domain]) {
3072  least_slack_in_design = f_timing_stats->least_slack[source_clock_domain][sink_clock_domain];
3073  critical_path_delay = f_timing_stats->cpd[source_clock_domain][sink_clock_domain];
3074  }
3075  }
3076  }
3077 
3078  return critical_path_delay * 1e9; /* Convert to nanoseconds */
3079 }
3080 
3081 void print_timing_stats(void) {
3082 
3083  /* Prints critical path delay/fmax, least slack in design, and, for multiple-clock designs,
3084  minimum required clock period to meet each constraint, least slack per constraint,
3085  geometric average clock frequency, and fanout-weighted geometric average clock frequency. */
3086 
3087  int source_clock_domain, sink_clock_domain, clock_domain, fanout, total_fanout = 0,
3088  num_netlist_clocks_with_intra_domain_paths = 0;
3089  float geomean_period = 1., least_slack_in_design = HUGE_POSITIVE_FLOAT, critical_path_delay = UNDEFINED;
3090  double fanout_weighted_geomean_period = 1.;
3091  boolean found;
3092 
3093  /* Find critical path delay. If the pb_max_internal_delay is greater than this, it becomes
3094  the limiting factor on critical path delay, so print that instead, with a special message. */
3095 
3096  for (source_clock_domain = 0; source_clock_domain < g_sdc->num_constrained_clocks; source_clock_domain++) {
3097  for (sink_clock_domain = 0; sink_clock_domain < g_sdc->num_constrained_clocks; sink_clock_domain++) {
3098  if (least_slack_in_design > f_timing_stats->least_slack[source_clock_domain][sink_clock_domain]) {
3099  least_slack_in_design = f_timing_stats->least_slack[source_clock_domain][sink_clock_domain];
3100  critical_path_delay = f_timing_stats->cpd[source_clock_domain][sink_clock_domain];
3101  }
3102  }
3103  }
3104 
3105  if (pb_max_internal_delay != UNDEFINED && pb_max_internal_delay > critical_path_delay) {
3106  critical_path_delay = pb_max_internal_delay;
3107  vpr_printf(TIO_MESSAGE_INFO, "Final critical path: %g ns\n", 1e9 * critical_path_delay);
3108  vpr_printf(TIO_MESSAGE_INFO, "\t(capped by fmax of block type %s)\n", pbtype_max_internal_delay->name);
3109 
3110  } else {
3111  vpr_printf(TIO_MESSAGE_INFO, "Final critical path: %g ns\n", 1e9 * critical_path_delay);
3112  }
3113 
3114  if (g_sdc->num_constrained_clocks <= 1) {
3115  /* Although critical path delay is always well-defined, it doesn't make sense to talk about fmax for multi-clock circuits */
3116  vpr_printf(TIO_MESSAGE_INFO, "f_max: %g MHz\n", 1e-6 / critical_path_delay);
3117  }
3118 
3119  /* Also print the least slack in the design */
3120  vpr_printf(TIO_MESSAGE_INFO, "\n");
3121  vpr_printf(TIO_MESSAGE_INFO, "Least slack in design: %g ns\n", 1e9 * least_slack_in_design);
3122  vpr_printf(TIO_MESSAGE_INFO, "\n");
3123 
3124  if (g_sdc->num_constrained_clocks > 1) { /* Multiple-clock design */
3125 
3126  /* Print minimum possible clock period to meet each constraint. Convert to nanoseconds. */
3127 
3128  vpr_printf(TIO_MESSAGE_INFO, "Minimum possible clock period to meet each constraint (including skew effects):\n");
3129  for (source_clock_domain = 0; source_clock_domain < g_sdc->num_constrained_clocks; source_clock_domain++) {
3130  /* Print the intra-domain constraint if it was analysed. */
3131  if (g_sdc->domain_constraint[source_clock_domain][source_clock_domain] > NEGATIVE_EPSILON) {
3132  vpr_printf(TIO_MESSAGE_INFO, "%s to %s: %g ns (%g MHz)\n",
3133  g_sdc->constrained_clocks[source_clock_domain].name,
3134  g_sdc->constrained_clocks[source_clock_domain].name,
3135  1e9 * f_timing_stats->cpd[source_clock_domain][source_clock_domain],
3136  1e-6 / f_timing_stats->cpd[source_clock_domain][source_clock_domain]);
3137  } else {
3138  vpr_printf(TIO_MESSAGE_INFO, "%s to %s: --\n",
3139  g_sdc->constrained_clocks[source_clock_domain].name,
3140  g_sdc->constrained_clocks[source_clock_domain].name);
3141  }
3142  /* Then, print all other constraints on separate lines, indented. We re-print
3143  the source clock domain's name so there's no ambiguity when parsing. */
3144  for (sink_clock_domain = 0; sink_clock_domain < g_sdc->num_constrained_clocks; sink_clock_domain++) {
3145  if (source_clock_domain == sink_clock_domain) continue; /* already done that */
3146  if (g_sdc->domain_constraint[source_clock_domain][sink_clock_domain] > NEGATIVE_EPSILON) {
3147  /* If this domain pair was analysed */
3148  vpr_printf(TIO_MESSAGE_INFO, "\t%s to %s: %g ns (%g MHz)\n",
3149  g_sdc->constrained_clocks[source_clock_domain].name,
3150  g_sdc->constrained_clocks[sink_clock_domain].name,
3151  1e9 * f_timing_stats->cpd[source_clock_domain][sink_clock_domain],
3152  1e-6 / f_timing_stats->cpd[source_clock_domain][sink_clock_domain]);
3153  } else {
3154  vpr_printf(TIO_MESSAGE_INFO, "\t%s to %s: --\n",
3155  g_sdc->constrained_clocks[source_clock_domain].name,
3156  g_sdc->constrained_clocks[sink_clock_domain].name);
3157  }
3158  }
3159  }
3160 
3161  /* Print least slack per constraint. */
3162 
3163  vpr_printf(TIO_MESSAGE_INFO, "\n");
3164  vpr_printf(TIO_MESSAGE_INFO, "Least slack per constraint:\n");
3165  for (source_clock_domain = 0; source_clock_domain < g_sdc->num_constrained_clocks; source_clock_domain++) {
3166  /* Print the intra-domain slack if valid. */
3167  if (f_timing_stats->least_slack[source_clock_domain][source_clock_domain] < HUGE_POSITIVE_FLOAT - 1) {
3168  vpr_printf(TIO_MESSAGE_INFO, "%s to %s: %g ns\n",
3169  g_sdc->constrained_clocks[source_clock_domain].name,
3170  g_sdc->constrained_clocks[source_clock_domain].name,
3171  1e9 * f_timing_stats->least_slack[source_clock_domain][source_clock_domain]);
3172  } else {
3173  vpr_printf(TIO_MESSAGE_INFO, "%s to %s: --\n",
3174  g_sdc->constrained_clocks[source_clock_domain].name,
3175  g_sdc->constrained_clocks[source_clock_domain].name);
3176  }
3177  /* Then, print all other slacks on separate lines. */
3178  for (sink_clock_domain = 0; sink_clock_domain < g_sdc->num_constrained_clocks; sink_clock_domain++) {
3179  if (source_clock_domain == sink_clock_domain) continue; /* already done that */
3180  if (f_timing_stats->least_slack[source_clock_domain][sink_clock_domain] < HUGE_POSITIVE_FLOAT - 1) {
3181  /* If this domain pair was analysed and has a valid slack */
3182  vpr_printf(TIO_MESSAGE_INFO, "\t%s to %s: %g ns\n",
3183  g_sdc->constrained_clocks[source_clock_domain].name,
3184  g_sdc->constrained_clocks[sink_clock_domain].name,
3185  1e9 * f_timing_stats->least_slack[source_clock_domain][sink_clock_domain]);
3186  } else {
3187  vpr_printf(TIO_MESSAGE_INFO, "\t%s to %s: --\n",
3188  g_sdc->constrained_clocks[source_clock_domain].name,
3189  g_sdc->constrained_clocks[sink_clock_domain].name);
3190  }
3191  }
3192  }
3193 
3194  /* Calculate geometric mean f_max (fanout-weighted and unweighted) from the diagonal (intra-domain) entries of critical_path_delay,
3195  excluding domains without intra-domain paths (for which the timing constraint is DO_NOT_ANALYSE) and virtual clocks. */
3196  found = FALSE;
3197  for (clock_domain = 0; clock_domain < g_sdc->num_constrained_clocks; clock_domain++) {
3198  if (g_sdc->domain_constraint[clock_domain][clock_domain] > NEGATIVE_EPSILON && g_sdc->constrained_clocks[clock_domain].is_netlist_clock) {
3199  geomean_period *= f_timing_stats->cpd[clock_domain][clock_domain];
3200  fanout = g_sdc->constrained_clocks[clock_domain].fanout;
3201  fanout_weighted_geomean_period *= pow((double) f_timing_stats->cpd[clock_domain][clock_domain], fanout);
3202  total_fanout += fanout;
3203  num_netlist_clocks_with_intra_domain_paths++;
3204  found = TRUE;
3205  }
3206  }
3207  if (found) { /* Only print these if we found at least one clock domain with intra-domain paths. */
3208  geomean_period = pow(geomean_period, (float) 1/num_netlist_clocks_with_intra_domain_paths);
3209  fanout_weighted_geomean_period = pow(fanout_weighted_geomean_period, (double) 1/total_fanout);
3210  /* Convert to MHz */
3211  vpr_printf(TIO_MESSAGE_INFO, "\n");
3212  vpr_printf(TIO_MESSAGE_INFO, "Geometric mean intra-domain period: %g ns (%g MHz)\n",
3213  1e9 * geomean_period, 1e-6 / geomean_period);
3214  vpr_printf(TIO_MESSAGE_INFO, "Fanout-weighted geomean intra-domain period: %g ns (%g MHz)\n",
3215  1e9 * fanout_weighted_geomean_period, 1e-6 / fanout_weighted_geomean_period);
3216  }
3217 
3218  vpr_printf(TIO_MESSAGE_INFO, "\n");
3219  }
3220 }
3221 
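
/* Standalone illustrative sketch (not part of path_delay.c) of the two summary
   statistics printed above: the geometric mean of the intra-domain critical
   path delays, and a fanout-weighted version in which each clock's period is
   weighted by the number of sequential elements/IOs it drives. Plain arrays
   stand in for f_timing_stats and g_sdc; the values are made up. */

#include <stdio.h>
#include <math.h>

int main(void) {
    double cpd_ns[2] = { 2.0, 4.0 };  /* intra-domain critical path delays (ns) */
    int fanout[2] = { 30, 10 };
    double geomean = 1.0, weighted = 1.0;
    int total_fanout = 0, i, n = 2;

    for (i = 0; i < n; i++) {
        geomean  *= cpd_ns[i];
        weighted *= pow(cpd_ns[i], (double) fanout[i]);
        total_fanout += fanout[i];
    }
    geomean  = pow(geomean, 1.0 / n);               /* ~2.83 ns */
    weighted = pow(weighted, 1.0 / total_fanout);   /* ~2.38 ns */

    printf("Geometric mean intra-domain period: %g ns (%g MHz)\n",
           geomean, 1000.0 / geomean);
    printf("Fanout-weighted geomean intra-domain period: %g ns (%g MHz)\n",
           weighted, 1000.0 / weighted);
    return 0;
}
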
3222 static void print_timing_constraint_info(const char *fname) {
3223  /* Prints the contents of g_sdc->domain_constraint, g_sdc->constrained_clocks, constrained_ios and g_sdc->cc_constraints to a file. */
3224 
3225  FILE * fp;
3226  int source_clock_domain, sink_clock_domain, i, j;
3227  int * clock_name_length = (int *) my_malloc(g_sdc->num_constrained_clocks * sizeof(int)); /* Array of clock name lengths */
3228  int max_clock_name_length = INT_MIN;
3229  char * clock_name;
3230 
3231  fp = my_fopen(fname, "w", 0);
3232 
3233  /* Get lengths of each clock name and max length so we can space the columns properly. */
3234  for (sink_clock_domain = 0; sink_clock_domain < g_sdc->num_constrained_clocks; sink_clock_domain++) {
3235  clock_name = g_sdc->constrained_clocks[sink_clock_domain].name;
3236  clock_name_length[sink_clock_domain] = strlen(clock_name);
3237  if (clock_name_length[sink_clock_domain] > max_clock_name_length) {
3238  max_clock_name_length = clock_name_length[sink_clock_domain];
3239  }
3240  }
3241 
3242  /* First, combine info from g_sdc->domain_constraint and g_sdc->constrained_clocks into a matrix
3243  (they're indexed the same as each other). */
3244 
3245  fprintf(fp, "Timing constraints in ns (source clock domains down left side, sink along top).\n"
3246  "A value of -1.00 means the pair of source and sink domains will not be analysed.\n\n");
3247 
3248  print_spaces(fp, max_clock_name_length + 4);
3249 
3250  for (sink_clock_domain = 0; sink_clock_domain < g_sdc->num_constrained_clocks; sink_clock_domain++) {
3251  fprintf(fp, "%s", g_sdc->constrained_clocks[sink_clock_domain].name);
3252  /* Minimum column width of 8 */
3253  print_spaces(fp, std::max(8 - clock_name_length[sink_clock_domain], 4));
3254  }
3255  fprintf(fp, "\n");
3256 
3257  for (source_clock_domain = 0; source_clock_domain < g_sdc->num_constrained_clocks; source_clock_domain++) {
3258  fprintf(fp, "%s", g_sdc->constrained_clocks[source_clock_domain].name);
3259  print_spaces(fp, max_clock_name_length + 4 - clock_name_length[source_clock_domain]);
3260  for (sink_clock_domain = 0; sink_clock_domain < g_sdc->num_constrained_clocks; sink_clock_domain++) {
3261  fprintf(fp, "%5.2f", g_sdc->domain_constraint[source_clock_domain][sink_clock_domain]);
3262  /* Minimum column width of 8 */
3263  print_spaces(fp, std::max(clock_name_length[sink_clock_domain] - 1, 3));
3264  }
3265  fprintf(fp, "\n");
3266  }
3267 
3268  free(clock_name_length);
3269 
3270  /* Second, print I/O constraints. */
3271  fprintf(fp, "\nList of constrained inputs (note: constraining a clock net input has no effect):\n");
3272  for (i = 0; i < g_sdc->num_constrained_inputs; i++) {
3273  fprintf(fp, "Input name %s on clock %s with input delay %.2f ns\n",
3274  g_sdc->constrained_inputs[i].name, g_sdc->constrained_inputs[i].clock_name, g_sdc->constrained_inputs[i].delay);
3275  }
3276 
3277  fprintf(fp, "\nList of constrained outputs:\n");
3278  for (i = 0; i < g_sdc->num_constrained_outputs; i++) {
3279  fprintf(fp, "Output name %s on clock %s with output delay %.2f ns\n",
3280  g_sdc->constrained_outputs[i].name, g_sdc->constrained_outputs[i].clock_name, g_sdc->constrained_outputs[i].delay);
3281  }
3282 
3283  /* Third, print override constraints. */
3284  fprintf(fp, "\nList of override constraints (non-default constraints created by set_false_path, set_clock_groups, \nset_max_delay, and set_multicycle_path):\n");
3285 
3286  for (i = 0; i < g_sdc->num_cc_constraints; i++) {
3287  fprintf(fp, "Clock domain");
3288  for (j = 0; j < g_sdc->cc_constraints[i].num_source; j++) {
3289  fprintf(fp, " %s,", g_sdc->cc_constraints[i].source_list[j]);
3290  }
3291  fprintf(fp, " to clock domain");
3292  for (j = 0; j < g_sdc->cc_constraints[i].num_sink - 1; j++) {
3293  fprintf(fp, " %s,", g_sdc->cc_constraints[i].sink_list[j]);
3294  } /* We have to print the last one separately because we don't want a comma after it. */
3295  if (g_sdc->cc_constraints[i].num_multicycles == 0) { /* not a multicycle constraint */
3296  fprintf(fp, " %s: %.2f ns\n", g_sdc->cc_constraints[i].sink_list[j], g_sdc->cc_constraints[i].constraint);
3297  } else { /* multicycle constraint */
3298  fprintf(fp, " %s: %d multicycles\n", g_sdc->cc_constraints[i].sink_list[j], g_sdc->cc_constraints[i].num_multicycles);
3299  }
3300  }
3301 
3302  for (i = 0; i < g_sdc->num_cf_constraints; i++) {
3303  fprintf(fp, "Clock domain");
3304  for (j = 0; j < g_sdc->cf_constraints[i].num_source; j++) {
3305  fprintf(fp, " %s,", g_sdc->cf_constraints[i].source_list[j]);
3306  }
3307  fprintf(fp, " to flip-flop");
3308  for (j = 0; j < g_sdc->cf_constraints[i].num_sink - 1; j++) {
3309  fprintf(fp, " %s,", g_sdc->cf_constraints[i].sink_list[j]);
3310  } /* We have to print the last one separately because we don't want a comma after it. */
3311  if (g_sdc->cf_constraints[i].num_multicycles == 0) { /* not a multicycle constraint */
3312  fprintf(fp, " %s: %.2f ns\n", g_sdc->cf_constraints[i].sink_list[j], g_sdc->cf_constraints[i].constraint);
3313  } else { /* multicycle constraint */
3314  fprintf(fp, " %s: %d multicycles\n", g_sdc->cf_constraints[i].sink_list[j], g_sdc->cf_constraints[i].num_multicycles);
3315  }
3316  }
3317 
3318  for (i = 0; i < g_sdc->num_fc_constraints; i++) {
3319  fprintf(fp, "Flip-flop");
3320  for (j = 0; j < g_sdc->fc_constraints[i].num_source; j++) {
3321  fprintf(fp, " %s,", g_sdc->fc_constraints[i].source_list[j]);
3322  }
3323  fprintf(fp, " to clock domain");
3324  for (j = 0; j < g_sdc->fc_constraints[i].num_sink - 1; j++) {
3325  fprintf(fp, " %s,", g_sdc->fc_constraints[i].sink_list[j]);
3326  } /* We have to print the last one separately because we don't want a comma after it. */
3327  if (g_sdc->fc_constraints[i].num_multicycles == 0) { /* not a multicycle constraint */
3328  fprintf(fp, " %s: %.2f ns\n", g_sdc->fc_constraints[i].sink_list[j], g_sdc->fc_constraints[i].constraint);
3329  } else { /* multicycle constraint */
3330  fprintf(fp, " %s: %d multicycles\n", g_sdc->fc_constraints[i].sink_list[j], g_sdc->fc_constraints[i].num_multicycles);
3331  }
3332  }
3333 
3334  for (i = 0; i < g_sdc->num_ff_constraints; i++) {
3335  fprintf(fp, "Flip-flop");
3336  for (j = 0; j < g_sdc->ff_constraints[i].num_source; j++) {
3337  fprintf(fp, " %s,", g_sdc->ff_constraints[i].source_list[j]);
3338  }
3339  fprintf(fp, " to flip-flop");
3340  for (j = 0; j < g_sdc->ff_constraints[i].num_sink - 1; j++) {
3341  fprintf(fp, " %s,", g_sdc->ff_constraints[i].sink_list[j]);
3342  } /* We have to print the last one separately because we don't want a comma after it. */
3343  if (g_sdc->ff_constraints[i].num_multicycles == 0) { /* not a multicycle constraint */
3344  fprintf(fp, " %s: %.2f ns\n", g_sdc->ff_constraints[i].sink_list[j], g_sdc->ff_constraints[i].constraint);
3345  } else { /* multicycle constraint */
3346  fprintf(fp, " %s: %d multicycles\n", g_sdc->ff_constraints[i].sink_list[j], g_sdc->ff_constraints[i].num_multicycles);
3347  }
3348  }
3349 
3350  fclose(fp);
3351 }
3352 
3353 static void print_spaces(FILE * fp, int num_spaces) {
3354  /* Prints num_spaces spaces to file pointed to by fp. */
3355  for ( ; num_spaces > 0; num_spaces--) {
3356  fprintf(fp, " ");
3357  }
3358 }
3359 
3360 void print_timing_graph_as_blif (const char *fname, t_model *models) {
3361  struct s_model_ports *port;
3362  struct s_linked_vptr *p_io_removed;
3363  /* Prints out the timing graph in BLIF netlist format to a file. */
3364 
3365  FILE *fp;
3366  int i, j;
3367 
3368  fp = my_fopen(fname, "w", 0);
3369 
3370  fprintf(fp, ".model %s\n", blif_circuit_name);
3371 
3372  fprintf(fp, ".inputs ");
3373  for (i = 0; i < num_logical_blocks; i++) {
3374  if (logical_block[i].type == VPACK_INPAD) {
3375  fprintf(fp, "\\\n%s ", logical_block[i].name);
3376  }
3377  }
3378  p_io_removed = circuit_p_io_removed;
3379  while (p_io_removed) {
3380  fprintf(fp, "\\\n%s ", (char *) p_io_removed->data_vptr);
3381  p_io_removed = p_io_removed->next;
3382  }
3383 
3384  fprintf(fp, "\n");
3385 
3386  fprintf(fp, ".outputs ");
3387  for (i = 0; i < num_logical_blocks; i++) {
3388  if (logical_block[i].type == VPACK_OUTPAD) {
3389  /* Outputs have an "out:" prefix prepended to them, which must be removed */
3390  fprintf(fp, "\\\n%s ", &logical_block[i].name[4]);
3391  }
3392  }
3393  fprintf(fp, "\n");
3394  fprintf(fp, ".names unconn\n");
3395  fprintf(fp, " 0\n\n");
3396 
3397  /* Print out primitives */
3398  for (i = 0; i < num_logical_blocks; i++) {
3399  print_primitive_as_blif (fp, i);
3400  }
3401 
3402  /* Print out tnode connections */
3403  for (i = 0; i < num_tnodes; i++) {
3404  if (tnode[i].type != TN_PRIMITIVE_IPIN && tnode[i].type != TN_FF_SOURCE
3405  && tnode[i].type != TN_INPAD_SOURCE
3406  && tnode[i].type != TN_OUTPAD_IPIN) {
3407  for (j = 0; j < tnode[i].num_edges; j++) {
3408  fprintf(fp, ".names tnode_%d tnode_%d\n", i,
3409  tnode[i].out_edges[j].to_node);
3410  fprintf(fp, "1 1\n\n");
3411  }
3412  }
3413  }
3414 
3415  fprintf(fp, ".end\n\n");
3416 
3417  /* Print out .subckt models */
3418  while (models) {
3419  fprintf(fp, ".model %s\n", models->name);
3420  fprintf(fp, ".inputs ");
3421  port = models->inputs;
3422  while (port) {
3423  if (port->size > 1) {
3424  for (j = 0; j < port->size; j++) {
3425  fprintf(fp, "%s[%d] ", port->name, j);
3426  }
3427  } else {
3428  fprintf(fp, "%s ", port->name);
3429  }
3430  port = port->next;
3431  }
3432  fprintf(fp, "\n");
3433  fprintf(fp, ".outputs ");
3434  port = models->outputs;
3435  while (port) {
3436  if (port->size > 1) {
3437  for (j = 0; j < port->size; j++) {
3438  fprintf(fp, "%s[%d] ", port->name, j);
3439  }
3440  } else {
3441  fprintf(fp, "%s ", port->name);
3442  }
3443  port = port->next;
3444  }
3445  fprintf(fp, "\n.blackbox\n.end\n\n");
3446  fprintf(fp, "\n\n");
3447  models = models->next;
3448  }
3449  fclose(fp);
3450 }
3451 
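
/* Standalone illustrative sketch (not part of path_delay.c): the BLIF
   constructs emitted by the function above. Every timing-graph edge becomes a
   two-terminal ".names" buffer with the "1 1" truth table, and each flip-flop
   becomes a ".latch" line (input, output, "re" = rising edge, clock, initial
   value 0). The tnode indices and block name below are made up. */

#include <stdio.h>

int main(void) {
    /* An edge from tnode 7 to tnode 12, printed as a buffer: */
    printf(".names tnode_7 tnode_12\n1 1\n\n");
    /* A flip-flop whose D input is tnode 3 and whose clock is tnode 9: */
    printf(".latch tnode_3 latch_my_ff re tnode_9 0\n\n");
    /* The latch output is then buffered onto its output tnode: */
    printf(".names latch_my_ff tnode_4\n1 1\n\n");
    return 0;
}
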
3452 static void print_primitive_as_blif (FILE *fpout, int iblk) {
3453  int i, j;
3454  struct s_model_ports *port;
3455  struct s_linked_vptr *truth_table;
3456  t_rr_node *irr_graph;
3457  t_pb_graph_node *pb_graph_node;
3458  int node;
3459 
3460  /* Print primitives found in timing graph in blif format based on whether this is a logical primitive or a physical primitive */
3461 
3462  if (logical_block[iblk].type == VPACK_INPAD) {
3463  if (logical_block[iblk].pb == NULL) {
3464  fprintf(fpout, ".names %s tnode_%d\n", logical_block[iblk].name,
3465  get_tnode_index(logical_block[iblk].output_net_tnodes[0][0]));
3466  } else {
3467  fprintf(fpout, ".names %s tnode_%d\n", logical_block[iblk].name,
3468  get_tnode_index(logical_block[iblk].pb->rr_graph[logical_block[iblk].pb->pb_graph_node->output_pins[0][0].pin_count_in_cluster].tnode));
3469  }
3470  fprintf(fpout, "1 1\n\n");
3471  } else if (logical_block[iblk].type == VPACK_OUTPAD) {
3472  /* outputs have the symbol out: automatically prepended to them; it must be removed */
3473  if (logical_block[iblk].pb == NULL) {
3474  fprintf(fpout, ".names tnode_%d %s\n",
3475  get_tnode_index(logical_block[iblk].input_net_tnodes[0][0]),
3476  &logical_block[iblk].name[4]);
3477  } else {
3478  /* avoid the out: from output pad naming */
3479  fprintf(fpout, ".names tnode_%d %s\n",
3480  get_tnode_index(logical_block[iblk].pb->rr_graph[logical_block[iblk].pb->pb_graph_node->input_pins[0][0].pin_count_in_cluster].tnode),
3481  (logical_block[iblk].name + 4));
3482  }
3483  fprintf(fpout, "1 1\n\n");
3484  } else if (strcmp(logical_block[iblk].model->name, "latch") == 0) {
3485  fprintf(fpout, ".latch ");
3486  node = OPEN;
3487 
3488  if (logical_block[iblk].pb == NULL) {
3489  i = 0;
3490  port = logical_block[iblk].model->inputs;
3491  while (port) {
3492  if (!port->is_clock) {
3493  assert(port->size == 1);
3494  for (j = 0; j < port->size; j++) {
3495  if (logical_block[iblk].input_net_tnodes[i][j] != NULL) {
3496  fprintf(fpout, "tnode_%d ",
3497  get_tnode_index(logical_block[iblk].input_net_tnodes[i][j]));
3498  } else {
3499  assert(0);
3500  }
3501  }
3502  i++;
3503  }
3504  port = port->next;
3505  }
3506  assert(i == 1);
3507 
3508  i = 0;
3509  port = logical_block[iblk].model->outputs;
3510  while (port) {
3511  assert(port->size == 1);
3512  for (j = 0; j < port->size; j++) {
3513  if (logical_block[iblk].output_net_tnodes[i][j] != NULL) {
3514  node =
3515  get_tnode_index(logical_block[iblk].output_net_tnodes[i][j]);
3516  fprintf(fpout, "latch_%s re ",
3517  logical_block[iblk].name);
3518  } else {
3519  assert(0);
3520  }
3521  }
3522  i++;
3523  port = port->next;
3524  }
3525  assert(i == 1);
3526 
3527  i = 0;
3528  port = logical_block[iblk].model->inputs;
3529  while (port) {
3530  if (port->is_clock) {
3531  for (j = 0; j < port->size; j++) {
3532  if (logical_block[iblk].clock_net_tnode != NULL) {
3533  fprintf(fpout, "tnode_%d 0\n\n",
3534  get_tnode_index(logical_block[iblk].clock_net_tnode));
3535  } else {
3536  assert(0);
3537  }
3538  }
3539  i++;
3540  }
3541  port = port->next;
3542  }
3543  assert(i == 1);
3544  } else {
3545  irr_graph = logical_block[iblk].pb->rr_graph;
3546  assert(
3547  irr_graph[logical_block[iblk].pb->pb_graph_node->input_pins[0][0].pin_count_in_cluster].net_num != OPEN);
3548  fprintf(fpout, "tnode_%d ",
3549  get_tnode_index(irr_graph[logical_block[iblk].pb->pb_graph_node->input_pins[0][0].pin_count_in_cluster].tnode));
3550  node =
3551  get_tnode_index(irr_graph[logical_block[iblk].pb->pb_graph_node->output_pins[0][0].pin_count_in_cluster].tnode);
3552  fprintf(fpout, "latch_%s re ", logical_block[iblk].name);
3553  assert(
3554  irr_graph[logical_block[iblk].pb->pb_graph_node->clock_pins[0][0].pin_count_in_cluster].net_num != OPEN);
3555  fprintf(fpout, "tnode_%d 0\n\n",
3556  get_tnode_index(irr_graph[logical_block[iblk].pb->pb_graph_node->clock_pins[0][0].pin_count_in_cluster].tnode));
3557  }
3558  assert(node != OPEN);
3559  fprintf(fpout, ".names latch_%s tnode_%d\n", logical_block[iblk].name,
3560  node);
3561  fprintf(fpout, "1 1\n\n");
3562  } else if (strcmp(logical_block[iblk].model->name, "names") == 0) {
3563  fprintf(fpout, ".names ");
3564  node = OPEN;
3565 
3566  if (logical_block[iblk].pb == NULL) {
3567  i = 0;
3568  port = logical_block[iblk].model->inputs;
3569  while (port) {
3570  assert(!port->is_clock);
3571  for (j = 0; j < port->size; j++) {
3572  if (logical_block[iblk].input_net_tnodes[i][j] != NULL) {
3573  fprintf(fpout, "tnode_%d ",
3574  get_tnode_index(logical_block[iblk].input_net_tnodes[i][j]));
3575  } else {
3576  break;
3577  }
3578  }
3579  i++;
3580  port = port->next;
3581  }
3582  assert(i == 1);
3583 
3584  i = 0;
3585  port = logical_block[iblk].model->outputs;
3586  while (port) {
3587  assert(port->size == 1);
3588  fprintf(fpout, "lut_%s\n", logical_block[iblk].name);
3589  node = get_tnode_index(logical_block[iblk].output_net_tnodes[0][0]);
3590  assert(node != OPEN);
3591  i++;
3592  port = port->next;
3593  }
3594  assert(i == 1);
3595  } else {
3596  irr_graph = logical_block[iblk].pb->rr_graph;
3597  assert(logical_block[iblk].pb->pb_graph_node->num_input_ports == 1);
3598  for (i = 0;
3599  i < logical_block[iblk].pb->pb_graph_node->num_input_pins[0];
3600  i++) {
3601  if(logical_block[iblk].input_nets[0][i] != OPEN) {
3602  for (j = 0; j < logical_block[iblk].pb->pb_graph_node->num_input_pins[0]; j++) {
3603  if (irr_graph[logical_block[iblk].pb->pb_graph_node->input_pins[0][j].pin_count_in_cluster].net_num
3604  != OPEN) {
3605  if (irr_graph[logical_block[iblk].pb->pb_graph_node->input_pins[0][j].pin_count_in_cluster].net_num == logical_block[iblk].input_nets[0][i]) {
3606  fprintf(fpout, "tnode_%d ",
3607  get_tnode_index(irr_graph[logical_block[iblk].pb->pb_graph_node->input_pins[0][j].pin_count_in_cluster].tnode));
3608  break;
3609  }
3610  }
3611  }
3612  assert(j < logical_block[iblk].pb->pb_graph_node->num_input_pins[0]);
3613  }
3614  }
3615  assert(
3616  logical_block[iblk].pb->pb_graph_node->num_output_ports == 1);
3617  assert(
3618  logical_block[iblk].pb->pb_graph_node->num_output_pins[0] == 1);
3619  fprintf(fpout, "lut_%s\n", logical_block[iblk].name);
3620  node =
3621  get_tnode_index(irr_graph[logical_block[iblk].pb->pb_graph_node->output_pins[0][0].pin_count_in_cluster].tnode);
3622  }
3623  assert(node != OPEN);
3624  truth_table = logical_block[iblk].truth_table;
3625  while (truth_table) {
3626  fprintf(fpout, "%s\n", (char*) truth_table->data_vptr);
3627  truth_table = truth_table->next;
3628  }
3629  fprintf(fpout, "\n");
3630  fprintf(fpout, ".names lut_%s tnode_%d\n", logical_block[iblk].name,
3631  node);
3632  fprintf(fpout, "1 1\n\n");
3633  } else {
3634  /* This is a standard .subckt blif structure */
3635  fprintf(fpout, ".subckt %s ", logical_block[iblk].model->name);
3636  if (logical_block[iblk].pb == NULL) {
3637  i = 0;
3638  port = logical_block[iblk].model->inputs;
3639  while (port) {
3640  if (!port->is_clock) {
3641  for (j = 0; j < port->size; j++) {
3642  if (logical_block[iblk].input_net_tnodes[i][j] != NULL) {
3643  if (port->size > 1) {
3644  fprintf(fpout, "\\\n%s[%d]=tnode_%d ",
3645  port->name, j,
3646  get_tnode_index(logical_block[iblk].input_net_tnodes[i][j]));
3647  } else {
3648  fprintf(fpout, "\\\n%s=tnode_%d ", port->name,
3649  get_tnode_index(logical_block[iblk].input_net_tnodes[i][j]));
3650  }
3651  } else {
3652  if (port->size > 1) {
3653  fprintf(fpout, "\\\n%s[%d]=unconn ", port->name,
3654  j);
3655  } else {
3656  fprintf(fpout, "\\\n%s=unconn ", port->name);
3657  }
3658  }
3659  }
3660  i++;
3661  }
3662  port = port->next;
3663  }
3664 
3665  i = 0;
3666  port = logical_block[iblk].model->outputs;
3667  while (port) {
3668  for (j = 0; j < port->size; j++) {
3669  if (logical_block[iblk].output_net_tnodes[i][j] != NULL) {
3670  if (port->size > 1) {
3671  fprintf(fpout, "\\\n%s[%d]=%s ", port->name, j,
3672  vpack_net[logical_block[iblk].output_nets[i][j]].name);
3673  } else {
3674  fprintf(fpout, "\\\n%s=%s ", port->name,
3675  vpack_net[logical_block[iblk].output_nets[i][j]].name);
3676  }
3677  } else {
3678  if (port->size > 1) {
3679  fprintf(fpout, "\\\n%s[%d]=unconn_%d_%s_%d ",
3680  port->name, j, iblk, port->name, j);
3681  } else {
3682  fprintf(fpout, "\\\n%s=unconn_%d_%s ", port->name,
3683  iblk, port->name);
3684  }
3685  }
3686  }
3687  i++;
3688  port = port->next;
3689  }
3690 
3691  i = 0;
3692  port = logical_block[iblk].model->inputs;
3693  while (port) {
3694  if (port->is_clock) {
3695  assert(port->size == 1);
3696  if (logical_block[iblk].clock_net_tnode != NULL) {
3697  fprintf(fpout, "\\\n%s=tnode_%d ", port->name,
3698  get_tnode_index(logical_block[iblk].clock_net_tnode));
3699  } else {
3700  fprintf(fpout, "\\\n%s=unconn ", port->name);
3701  }
3702  i++;
3703  }
3704  port = port->next;
3705  }
3706 
3707  fprintf(fpout, "\n\n");
3708 
3709  i = 0;
3710  port = logical_block[iblk].model->outputs;
3711  while (port) {
3712  for (j = 0; j < port->size; j++) {
3713  if (logical_block[iblk].output_net_tnodes[i][j] != NULL) {
3714  fprintf(fpout, ".names %s tnode_%d\n",
3715  vpack_net[logical_block[iblk].output_nets[i][j]].name,
3716  get_tnode_index(logical_block[iblk].output_net_tnodes[i][j]));
3717  fprintf(fpout, "1 1\n\n");
3718  }
3719  }
3720  i++;
3721  port = port->next;
3722  }
3723  } else {
3724  irr_graph = logical_block[iblk].pb->rr_graph;
3725  pb_graph_node = logical_block[iblk].pb->pb_graph_node;
3726  for (i = 0; i < pb_graph_node->num_input_ports; i++) {
3727  for (j = 0; j < pb_graph_node->num_input_pins[i]; j++) {
3728  if (irr_graph[pb_graph_node->input_pins[i][j].pin_count_in_cluster].net_num
3729  != OPEN) {
3730  if (pb_graph_node->num_input_pins[i] > 1) {
3731  fprintf(fpout, "\\\n%s[%d]=tnode_%d ",
3732  pb_graph_node->input_pins[i][j].port->name,
3733  j,
3734  get_tnode_index(irr_graph[pb_graph_node->input_pins[i][j].pin_count_in_cluster].tnode));
3735  } else {
3736  fprintf(fpout, "\\\n%s=tnode_%d ",
3737  pb_graph_node->input_pins[i][j].port->name,
3738  get_tnode_index(irr_graph[pb_graph_node->input_pins[i][j].pin_count_in_cluster].tnode));
3739  }
3740  } else {
3741  if (pb_graph_node->num_input_pins[i] > 1) {
3742  fprintf(fpout, "\\\n%s[%d]=unconn ",
3743  pb_graph_node->input_pins[i][j].port->name,
3744  j);
3745  } else {
3746  fprintf(fpout, "\\\n%s=unconn ",
3747  pb_graph_node->input_pins[i][j].port->name);
3748  }
3749  }
3750  }
3751  }
3752  for (i = 0; i < pb_graph_node->num_output_ports; i++) {
3753  for (j = 0; j < pb_graph_node->num_output_pins[i]; j++) {
3754  if (irr_graph[pb_graph_node->output_pins[i][j].pin_count_in_cluster].net_num
3755  != OPEN) {
3756  if (pb_graph_node->num_output_pins[i] > 1) {
3757  fprintf(fpout, "\\\n%s[%d]=%s ",
3758  pb_graph_node->output_pins[i][j].port->name,
3759  j,
3760  vpack_net[irr_graph[pb_graph_node->output_pins[i][j].pin_count_in_cluster].net_num].name);
3761  } else {
3762  char* port_name =
3763  pb_graph_node->output_pins[i][j].port->name;
3764  int pin_count =
3765  pb_graph_node->output_pins[i][j].pin_count_in_cluster;
3766  int node_index = irr_graph[pin_count].net_num;
3767  char* node_name = vpack_net[node_index].name;
3768  fprintf(fpout, "\\\n%s=%s ", port_name, node_name);
3769  }
3770  } else {
3771  if (pb_graph_node->num_output_pins[i] > 1) {
3772  fprintf(fpout, "\\\n%s[%d]=unconn ",
3773  pb_graph_node->output_pins[i][j].port->name,
3774  j);
3775  } else {
3776  fprintf(fpout, "\\\n%s=unconn ",
3777  pb_graph_node->output_pins[i][j].port->name);
3778  }
3779  }
3780  }
3781  }
3782  for (i = 0; i < pb_graph_node->num_clock_ports; i++) {
3783  for (j = 0; j < pb_graph_node->num_clock_pins[i]; j++) {
3784  if (irr_graph[pb_graph_node->clock_pins[i][j].pin_count_in_cluster].net_num
3785  != OPEN) {
3786  if (pb_graph_node->num_clock_pins[i] > 1) {
3787  fprintf(fpout, "\\\n%s[%d]=tnode_%d ",
3788  pb_graph_node->clock_pins[i][j].port->name,
3789  j,
3790  get_tnode_index(irr_graph[pb_graph_node->clock_pins[i][j].pin_count_in_cluster].tnode));
3791  } else {
3792  fprintf(fpout, "\\\n%s=tnode_%d ",
3793  pb_graph_node->clock_pins[i][j].port->name,
3794  get_tnode_index(irr_graph[pb_graph_node->clock_pins[i][j].pin_count_in_cluster].tnode));
3795  }
3796  } else {
3797  if (pb_graph_node->num_clock_pins[i] > 1) {
3798  fprintf(fpout, "\\\n%s[%d]=unconn ",
3799  pb_graph_node->clock_pins[i][j].port->name,
3800  j);
3801  } else {
3802  fprintf(fpout, "\\\n%s=unconn ",
3803  pb_graph_node->clock_pins[i][j].port->name);
3804  }
3805  }
3806  }
3807  }
3808 
3809  fprintf(fpout, "\n\n");
3810  /* connect up output port names to output tnodes */
3811  for (i = 0; i < pb_graph_node->num_output_ports; i++) {
3812  for (j = 0; j < pb_graph_node->num_output_pins[i]; j++) {
3813  if (irr_graph[pb_graph_node->output_pins[i][j].pin_count_in_cluster].net_num
3814  != OPEN) {
3815  fprintf(fpout, ".names %s tnode_%d\n",
3816  vpack_net[irr_graph[pb_graph_node->output_pins[i][j].pin_count_in_cluster].net_num].name,
3817  get_tnode_index(irr_graph[pb_graph_node->output_pins[i][j].pin_count_in_cluster].tnode));
3818  fprintf(fpout, "1 1\n\n");
3819  }
3820  }
3821  }
3822  }
3823  }
3824 }
static int find_output(char *net_name)
Definition: path_delay.c:3006
t_override_constraint * cc_constraints
Definition: vpr_types.h:440
e_tnode_type type
Definition: vpr_types.h:335
struct s_tnode *** output_net_tnodes
Definition: vpr_types.h:224
struct s_net * vpack_net
Definition: globals.c:19
char * getEchoFileName(enum e_echo_files echo_option)
Definition: ReadOptions.c:122
float clock_delay
Definition: vpr_types.h:355
static char * find_tnode_net_name(int inode, boolean is_prepacked)
Definition: path_delay.c:2936
int prev_edge
Definition: vpr_types.h:916
t_timing_constraints * g_sdc
Definition: read_sdc.c:65
float max_internal_delay
void free_chunk_memory(t_chunk *chunk_info)
Definition: util.c:270
t_model * model
Definition: vpr_types.h:209
int num_output_pins
#define NEGATIVE_EPSILON
Definition: vpr_types.h:84
int ** output_nets
Definition: vpr_types.h:212
messagelogger vpr_printf
Definition: util.c:17
t_pb_graph_node * pb_graph_node
Definition: vpr_types.h:180
static void set_and_balance_arrival_time(int to_node, int from_node, float Tdel, boolean do_lut_input_balancing)
Definition: path_delay.c:2707
void free_timing_graph(t_slack *slacks)
Definition: path_delay.c:390
#define HUGE_NEGATIVE_FLOAT
Definition: vpr_types.h:80
int num_sinks
Definition: vpr_types.h:506
void print_net_delay(float **net_delay, const char *fname)
Definition: path_delay.c:670
void load_timing_graph_net_delays(float **net_delay)
Definition: path_delay.c:368
t_io * constrained_inputs
Definition: vpr_types.h:434
static boolean has_valid_T_arr(int inode)
Definition: path_delay.c:3043
int * node_block_port
Definition: vpr_types.h:508
t_pb_graph_pin ** input_pins
struct s_logical_block * logical_block
Definition: globals.c:20
int num_clock_pins
Definition: util.h:12
int alloc_and_load_timing_graph_levels(void)
Definition: path_delay2.c:81