MJ Logic Design
Projects
The following details our projects over the past 10 years. While the majority of
this work was done by the consulting team of Mike Stewart and John Nash
(now MJ Logic Design), it also includes work we did as co-founders of a
startup, in addition to work we did as part of a larger consulting group. For
some of our more extensive projects, we have provided a link to a page with
more detailed project information.
Support Processor Sub-System: This block performed chip-level
configuration/supervision as well as “slow-path” packet processing in a next-
generation storage processor ASIC. The block consisted of a BVCI-based
infrastructure necessary to connect an ARC processor core with various on-chip
blocks (e.g., PCIe interface, buffer manager, DMA) and
internal/external memories (DDR SRAM, DDR SDRAM, SRAM).
Layer 4 Processor: This block was part of an Application Delivery System
Platform, and was responsible for supporting L2-L7 traffic in hardware, with
an emphasis on terminating TCP connections and supporting subsequent L5-L7
processing. The design was partitioned across two Xilinx Virtex-E 3200E
FPGA devices operating in multiple clock domains (<=133 MHz), and included
logic to manage numerous per-flow packet queues and corresponding data
structures.
Traffic Services Module (TSM) Card: This card was the centralized traffic
scheduling engine for an Application Delivery System Platform. The goal of
this platform was to “revolutionize the way applications are delivered over the
internet”. This card, realized with a series of large Virtex-E Xilinx FPGAs,
performed flow-based rate and response-time traffic shaping.
Packet Processor ASIC: This ASIC was used on a Quad OC-48 Line Card
(within a terabit router) and performed packet classification/routing at 8.33
MPkts/sec for IPv4, MPLS, and various control protocols, and supported either
a single OC-48 channel or quad OC-12 channels. The ASIC operated in multiple
clock domains (<=133 MHz) and was implemented in 1.2M gates in the LSI G11
process.
Full-Feature DMA Controller: This 24-channel, descriptor-based DMA
provided four AMBA-AHB Master interfaces for implementing DMA transfers
between system memory and various high- and low-speed peripherals. Key
features included: linked lists for per-channel descriptor chains, per-channel
transfer buffers to support scatter/gather, and multiple AHB master interfaces
to support both single- and dual-master transfers.
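To illustrate the descriptor-chain concept (a simplified software model only; field names and layout are illustrative, not the actual hardware format), each channel walks a linked list of descriptors until it reaches a null link, which is also how scatter/gather is expressed:

```python
class Descriptor:
    """One entry in a per-channel descriptor chain (illustrative fields)."""
    def __init__(self, src, dst, length, next_desc=None):
        self.src = src            # source address
        self.dst = dst            # destination address
        self.length = length      # bytes to move
        self.next = next_desc     # link to next descriptor; None terminates

def run_channel(head, memory):
    """Walk one channel's descriptor chain, copying byte by byte."""
    desc, moved = head, 0
    while desc is not None:
        for i in range(desc.length):
            memory[desc.dst + i] = memory[desc.src + i]
        moved += desc.length
        desc = desc.next
    return moved
```

A gather operation is then simply a chain whose descriptors pull from scattered source buffers into one contiguous destination (and the reverse for scatter).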
Programmable Micro-Coded Engine: This block was part of a TCP Offload
ASIC, and was a Very Long Instruction Word (VLIW) microcoded engine that
was tailored to perform packet parsing/building for layers 2-4
(Ethernet/IP/TCP) as well as upper-layer protocols (e.g., iSCSI). This block
operated at 300 MHz in a 0.13-micron process, and consumed ~100k gates
plus instruction memory.
Ethernet/AV Controller ASIC: This ASIC provided bidirectional bridging of
isochronous audio and video streams between multiple gigabit Ethernet
and programmable A/V interfaces (2 GbE, 4 audio, 1 video). Audio/video
formats included IEC 61883 for GbE, I2S for serial audio, and MPEG-2/BT.656 for
parallel video. Presentation time was supported to allow synchronization of
related streams to be maintained as streams were independently
enabled/disabled.
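The presentation-time mechanism can be sketched as follows (a toy software model, not the actual hardware): samples are held until a shared media clock reaches their timestamp, so streams enabled at different times still line up against the same clock:

```python
def release_due(queue, now):
    """Release queued A/V samples whose presentation time has arrived.

    queue: list of (presentation_time, payload) pairs in timestamp order.
    Because every stream is released against the same media clock, related
    streams stay aligned even when enabled/disabled independently.
    """
    out = []
    while queue and queue[0][0] <= now:
        out.append(queue.pop(0)[1])
    return out
```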
Ethernet/AV Controller ASIC Verification: Developed the chip-level
verification environment and the sign-off regression suite to verify an
Ethernet/AV Controller ASIC. This included development of the packet
generators/checkers on the Ethernet side, in addition to the models of the
various CODECs on the A/V side. Overall, the environment verified end-to-end
data integrity in both directions (A/V to/from Ethernet), in addition to checking
the presentation time aspects of the data transfers, which kept related
streams synchronized. We developed an extensive library of configuration,
sequencing, and bridging routines for the ASIC that emulated real-time
functions performed by the on-chip CPU (utilized later as software drivers
were developed). The environment was capable of very dynamic operation,
randomly taking various A/V streams up and down throughout the course of a
given simulation.
Ray-Tracing Accelerator Chip-Level Verification: Developed the chip-level
verification environment and the regression suite for a ray tracing accelerator
ASIC (first implemented in FPGAs). Worked with the chip architect to understand
internal operation, then modified the Northwest Logic PCIe test environment to
include stimulus generation and results processing to verify the graphics
core. We pushed the first scenes through the entire design, which required
understanding the scene databases as well as how intermediate results flow
through the design. Assumed responsibility for investigating all chip-level
failures and developing proposed RTL fixes (for subsequent approval by the
designer) as necessary.
Ray-Tracing Accelerator Unit-Level Verification: Developed unit-level
environment and the regression suite for the core datapath and ray/scene
testing block in a ray-tracing accelerator ASIC. The
generator/scoreboard/checker environment we architected was capable of
randomizing all aspects of the input transaction-based streams. In support of
generating real-world traffic patterns, tests could set up various input
transaction threads, each with different properties. The environment utilized
the SystemVerilog (SV) Direct Programming Interface (DPI) for interfacing with
C functions for calculation of some of the math-intensive expected results. We
also developed a series of ray intersection tests for one of the key math
blocks, where we used dot- and cross-product math to randomly build the
necessary ray/scene combinations that stimulate important corner cases.
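The corner-case construction can be sketched roughly as follows (a simplified Python model; the actual environment was SystemVerilog with DPI, and the intersection routine shown, Moller-Trumbore, is a stand-in): pick a random interior point of a triangle via barycentric coordinates, then fire a ray at it along the triangle normal, so the dot-/cross-product intersection math is guaranteed to register a hit:

```python
import random

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_hits_triangle(orig, direc, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore ray/triangle test, built from dot and cross products."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direc, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return False                      # ray parallel to triangle plane
    inv = 1.0 / det
    t_vec = sub(orig, v0)
    u = dot(t_vec, p) * inv
    if u < 0.0 or u > 1.0:
        return False
    q = cross(t_vec, e1)
    v = dot(direc, q) * inv
    if v < 0.0 or u + v > 1.0:
        return False
    return dot(e2, q) * inv > eps         # hit must lie in front of origin

def make_hitting_ray(v0, v1, v2, rng):
    """Build a ray guaranteed to hit: aim at a random interior point
    (barycentric u, v kept away from the edges) along the triangle normal."""
    u = rng.uniform(0.05, 0.45)
    v = rng.uniform(0.05, 0.45)
    target = tuple(v0[i] + u * (v1[i] - v0[i]) + v * (v2[i] - v0[i])
                   for i in range(3))
    n = cross(sub(v1, v0), sub(v2, v0))   # normal via cross product
    orig = tuple(target[i] - n[i] for i in range(3))
    return orig, n                        # direction along the normal
```

Edge and miss cases follow the same recipe by pushing the barycentric point toward (or past) an edge, or reversing the ray.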
PCI/PCI-X Interface: This block supported transfers between an internal
AMBA-AHB bus and external PCI-X devices. The design supported both
master (AHB-to-PCI) and target (PCI-to-AHB) transfers. The Synopsys PCI-X
core was used to implement the basic PCI/PCI-X interface. The key features of
this block included: 32/64-bit PCI or PCI-X operation, host or non-host
operation, programmable address translation entries to convert from the AHB
address space to the PCI address space, and support for generating Type 0
and Type 1 PCI configuration cycles. The AHB slave interface supported
splitting up to four AHB masters.
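The address-translation scheme amounts to a set of programmable windows; a minimal model (entry layout and the power-of-two window assumption are illustrative, not the actual register format):

```python
class XlatEntry:
    """One programmable AHB-to-PCI address translation window (illustrative)."""
    def __init__(self, ahb_base, size, pci_base):
        assert size & (size - 1) == 0, "window size must be a power of two"
        assert ahb_base % size == 0 and pci_base % size == 0
        self.ahb_base, self.size, self.pci_base = ahb_base, size, pci_base

def translate(entries, ahb_addr):
    """Map an AHB address into PCI space via the first matching window."""
    for e in entries:
        if e.ahb_base <= ahb_addr < e.ahb_base + e.size:
            return e.pci_base | (ahb_addr & (e.size - 1))
    return None  # no window claims this address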
Traffic Manager ASIC: This ASIC controlled the flow of packets from ingress
to egress on a Quad OC-48 Line Card within a terabit router. Features
included: QoS-based shaping, Weighted Random Early Discard (WRED), and
free buffer management. This 800K-gate ASIC was realized in the LSI G11
process. Product shipped first silicon.
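WRED's behavior is worth a brief sketch (textbook RED mechanics shown here, not the ASIC's exact implementation): the drop probability ramps linearly between two thresholds applied to an averaged queue length:

```python
import random

def update_avg(avg, qlen, w=0.002):
    """EWMA of the instantaneous queue length (standard RED averaging)."""
    return avg + w * (qlen - avg)

def wred_drop(avg_qlen, min_th, max_th, max_p, rng):
    """Drop decision: never drop below min_th, always drop at/above max_th,
    and ramp the drop probability linearly (up to max_p) in between."""
    if avg_qlen < min_th:
        return False
    if avg_qlen >= max_th:
        return True
    p = max_p * (avg_qlen - min_th) / (max_th - min_th)
    return rng.random() < p
```

The "weighted" part comes from running separate threshold sets per traffic class so higher-priority flows see a gentler drop curve.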
DDR SDRAM Core Verification: Developed a Vera-based
generator/scoreboard/checker verification environment that not only checked
the functional correctness of a DDR SDRAM Controller Core, but also verified
the various performance optimization algorithms implemented by the core.
SPI-4 Interface Core: This block implemented a System Packet Interface
Level 4 (SPI-4) Phase 2 10Gbit/sec chip-to-chip interface core. Protocol
compliance was assured by using test suites provided by PMC-Sierra. The
design included the optional, but highly desirable, dynamic bit alignment
feature.
1G/10G PCS: This block provided the interface between four external 1G/10G
serial links and four corresponding internal FC/Ethernet MACs. These links
operated independently in 1G mode, but were synchronized in 10G mode to
implement a single high-speed link. Specific functions included: 8b/10b
encoding/decoding, idle insertion/replacement, clock domain conversion, link
synchronization, 10G
lane-to-lane deskew, and XGMII/XAUI translation.
Quad Ethernet Interface: This block included four 10/100/1000 Mb/s Ethernet
ports, and supported packet transfers between the four G/MII/TBI interfaces
and a single AHB slave interface. The design utilized the Mentor 10/100/1000
MAC/FIFO/Statistics cores, as well as custom logic to interface the cores to
the shared AHB bus interface. Significant modifications to the cores were
made to include features such as programmable cut-through operation, IPDA
address filtering, and a MAC-bypass mode that supported both proprietary
and 8/16-bit POS-PHY Level 3 (PL3) protocols.
Bridge FPGA: This Xilinx Virtex-4 Bridge FPGA extended the client’s ASIC,
allowing it to communicate with a series of processors and other peripherals
via a bi-directional proprietary-protocol messaging interface. The FPGA
contained two PowerPC 405 processors, various peripheral interfaces, and a
message-based infrastructure necessary to interconnect the processors,
multiple client ASIC devices, and the external peripherals. The external
peripherals included PCIe, DDR2 SDRAM, SRAM, and Flash memory. At the
core of the interconnect logic was a multi-ported message crossbar arbiter
that arbitrated on message boundaries and allowed simultaneous transfers
between any pair of ports. Surrounding the arbiter were various types of
converter modules to bridge between the messaging protocol and the native
interface of the various peripheral cores.
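The arbiter's behavior can be modeled simply (a toy model with hypothetical names, not the actual RTL): each output port round-robins among requesting inputs, holds its grant until the message boundary, and grants to disjoint input/output pairs proceed in parallel:

```python
class MessageCrossbar:
    """Toy model of a multi-ported crossbar that arbitrates on message
    boundaries: each output grants one input at a time, round-robin, and
    holds the grant until that message's final word (illustrative only)."""
    def __init__(self, num_ports):
        self.num_ports = num_ports
        self.grant = [None] * num_ports   # per-output: granted input, or None
        self.rr = [0] * num_ports         # per-output round-robin pointer

    def arbitrate(self, requests):
        """requests: set of (src, dst) pairs with a message waiting.
        Returns {dst: src} grants; disjoint pairs transfer simultaneously."""
        for dst in range(self.num_ports):
            if self.grant[dst] is not None:
                continue                  # mid-message: keep current grant
            for i in range(self.num_ports):
                src = (self.rr[dst] + i) % self.num_ports
                if (src, dst) in requests:
                    self.grant[dst] = src
                    self.rr[dst] = (src + 1) % self.num_ports
                    break
        return {d: s for d, s in enumerate(self.grant) if s is not None}

    def message_done(self, dst):
        """Called at a message boundary to release the output port."""
        self.grant[dst] = None
```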
Line Card Verification Environment: Developed a series of port models
(packet generators/packet checkers) used to verify a Quad OC-48 Line Card
within a terabit router. These port models generated and verified both IPv4
and MPLS packets, and incorporated a “rules table” to predict how the packets
would be routed and classified. They also verified the packets as different
types of headers were added and removed while the packets flowed across the
Line Card.
Reduced-Area DMA: Developed a reduced-area DMA that parameterized the
number of datapath interfaces, the datapath width, and the number of
channels, enabling each instance to be customized to its specific
requirements within the client’s ASIC.
Buffer Manager: The Buffer Manager was an on-chip, cell-switched,
queuing/forwarding engine that used on-chip SRAM-based buffers and linked
lists to switch packets between various internal and external peripherals. We
extended this functionality into external DDR memory, where programmable
thresholds were provided to seamlessly switch between normal operation
and extended mode as necessary when the per-channel buffer pool became
low. In extended mode, cells were first buffered in DDR and retrieved later as
per-channel buffers became available.
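The threshold-driven switch between normal and extended mode can be sketched as follows (a toy model; names and policy details are illustrative): cells spill to DDR once the free on-chip pool drops to the programmed threshold, and are retrieved in order as buffers free up:

```python
class BufferPool:
    """Per-channel pool that spills to external DDR when on-chip buffers
    run low, and drains DDR as buffers free up (illustrative model)."""
    def __init__(self, on_chip_buffers, low_threshold):
        self.free = on_chip_buffers
        self.low = low_threshold
        self.ddr_backlog = []               # cells parked in external DDR

    def enqueue(self, cell):
        if self.ddr_backlog or self.free <= self.low:
            self.ddr_backlog.append(cell)   # extended mode: park cell in DDR
        else:
            self.free -= 1                  # normal mode: take on-chip buffer

    def release_buffer(self):
        """An on-chip buffer was freed; retrieve a parked cell if any."""
        self.free += 1
        if self.ddr_backlog and self.free > self.low:
            self.free -= 1
            return self.ddr_backlog.pop(0)  # cells return from DDR in order
        return None
```

Checking the backlog first on enqueue keeps cell order intact across the mode switch, which is what makes the transition seamless.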
Multi-Mode Backplane Interface ASIC: This six-mode ASIC supported
various high-speed data/message paths between individual port cards and
central memory for a 64-port Fibre Channel switch.
Ethernet/FC MACs and Cable Modem: Earlier projects included helping to
develop some of the industry’s earliest Ethernet/Fibre Channel gigabit MACs
and cable modems.