Friday, January 12, 2018

Address Masking for RAM Devices

I found the problem of generating the correct address for a memory operation confusing and decided that it would be of value to me (and others) to take notes.

Assume we have an implementation of a slave device on a TileLink2 bus. It can receive either read (Get) or write (PutFullData) messages on channel A and correctly return the response on channel D.  More information about this in the spec.

The following covers the implementation of the module, specifically on the mask generation, nothing more.

Assumptions. The slave device covers address range 0x2000000, size 0x1FFF (or 8KB).  The system bus uses 8 bytes per beat (matching the 64 bit bus).  The memory width matches the bus width.  Number of memory words is 1024 and address width is 10 bits.

val address = in.a.bits.address // receive the full 64-bit physical address
val size = 0x1FFF // 8KB size, get using addressSet.mask
val size = in.a.size  // log2(bytes) of total amount of the data requested, e.g., 8B cache line == 1

Problem.  To compute the word-aligned address, you need a mask.  Here is an example worked out by hand.

2000_1000 ^ (0x2000_1000 & ~mask-1) = 0x1000 (byte aligned) // where mask = 0x2000
1000 / 8 = 0x200 (8B word aligned)
addr[9:0] = 0x200

That was the easy part.  I got confused by the operation of the circuit generator.  Isn't the following great ;)

val a_address   = Cat((mask zip (in.haddr >> log2Ceil(beatBytes)).toBools).filter(_._1).map(_._2).reverse)

This is definitely one the least appealing parts of functional programming.  It would take me a few hours to reverse engineer this.  How this should work is any non-trivial function should be placed in its own container and unit tested.  The unit tests are commented and serve as a description of the functional algorithm.  But, I digress ...

There is a helper object in the util package.  Moral of the story.  Use it.

// This gets used everywhere, so make the smallest circuit possible ...
// Given an address and size, create a mask of beatBytes size
// eg: (0x3, 0, 4) => 0001, (0x3, 1, 4) => 0011, (0x3, 2, 4) => 1111
// groupBy applies an interleaved OR reduction; groupBy=2 take 0010 => 01
object MaskGen {
  def apply(addr_lo: UInt, lgSize: UInt, beatBytes: Int, groupBy: Int = 1): UInt = {
[..]