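// Modeling different coherence msg types over different msg classes.
// GarnetSyntheticTraffic assumes the Garnet_standalone coherence protocol
// which models three message classes/virtual networks.
// These are: request, forward, response.
// requests and forwards are "control" packets (typically 8 bytes),
// while responses are "data" packets (typically 72 bytes).
// Life of a packet from the tester into the network:
// (1) This function generatePkt() generates packets of one of the
//     following 3 types (randomly) : ReadReq, INST_FETCH, WriteReq
// (2) mem/ruby/system/RubyPort.cc converts these to RubyRequestType_LD,
//     RubyRequestType_IFETCH, RubyRequestType_ST respectively
// (3) mem/ruby/system/Sequencer.cc sends these to the cache controllers
//     in the coherence protocol.
// (4) Network_test-cache.sm tags RubyRequestType:LD,
//     RubyRequestType:IFETCH and RubyRequestType:ST as
//     Request, Forward, and Response events respectively;
//     and injects them into virtual networks 0, 1 and 2 respectively.
//     It immediately calls back the sequencer.
// (5) The packet traverses the network (simple/garnet) and reaches its
//     destination (Directory), and network stats are updated.
// (6) Network_test-dir.sm simply drops the packet.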
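The mapping described in the comment above can be condensed into a small lookup table. This is only a Python sketch of the bookkeeping, not gem5 code: the names `PacketClass`, `LIFECYCLE`, and `describe` are made up for illustration, and the sizes are the "typical" values quoted in the comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketClass:
    ruby_type: str      # type after the conversion in RubyPort.cc
    message_class: str  # event tagged by Network_test-cache.sm
    vnet: int           # virtual network the message is injected into
    size_bytes: int     # typical size quoted in the comment


# Tester command -> how the lifecycle comment says it is handled.
LIFECYCLE = {
    "ReadReq": PacketClass("RubyRequestType_LD", "Request", 0, 8),
    "INST_FETCH": PacketClass("RubyRequestType_IFETCH", "Forward", 1, 8),
    "WriteReq": PacketClass("RubyRequestType_ST", "Response", 2, 72),
}


def describe(cmd: str) -> str:
    """Return a one-line summary of what happens to a tester command."""
    c = LIFECYCLE[cmd]
    kind = "data" if c.message_class == "Response" else "control"
    return (f"{cmd} -> {c.ruby_type} -> {c.message_class} "
            f"(vnet {c.vnet}, {kind}, {c.size_bytes} B)")
```

Calling `describe("WriteReq")` summarizes that a write ends up as a data-sized Response message on virtual network 2.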
Start Docker first, then cd into the gem5 directory.
sudo docker run -u $UID:$GID --volume /home/yz/myprojects/2024GEM5/parsec-tests/yzmodifiedgem5/:/gem5 --rm -it gcr.io/gem5-test/ubuntu-22.04_all-dependencies:v22-1 #docker run -u $UID:$GID --volume <gem5 directory>:/gem5 --rm -it <image>
Following the official documentation, build Garnet standalone:
scons build/NULL/gem5.debug PROTOCOL=Garnet_standalone
./build/NULL/gem5.debug configs/example/garnet_synth_traffic.py \
--num-cpus=16 \
--num-dirs=16 \
--network=garnet2.0 \
--topology=Mesh_XY \
--mesh-rows=4 \
--sim-cycles=1000 \
--synthetic=uniform_random \
--injectionrate=0.01
In this Python config file, the CPU type is GarnetSyntheticTraffic:
cpus = [
    GarnetSyntheticTraffic(num_packets_max=args.num_packets_max)  # other keyword arguments elided
    for i in range(args.num_cpus)
]
src/cpu/testers/garnet_synthetic_traffic/GarnetSyntheticTraffic.py declares the SimObject and binds it (via pybind) to the C++ code:
class GarnetSyntheticTraffic(ClockedObject):
    type = "GarnetSyntheticTraffic"
    cxx_header = (
        "cpu/testers/garnet_synthetic_traffic/GarnetSyntheticTraffic.hh"
    )
    cxx_class = "gem5::GarnetSyntheticTraffic"
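The long list of keyword arguments in garnet_synth_traffic.py, such as num_packets_max=args.num_packets_max, reaches C++ as const Params &p. In src/cpu/testers/garnet_synthetic_traffic/GarnetSyntheticTraffic.cc, the constructor uses it to initialize the member variables at creation time: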
GarnetSyntheticTraffic::GarnetSyntheticTraffic(const Params &p)
    : ClockedObject(p),
      tickEvent([this]{ tick(); }, "GarnetSyntheticTraffic tick",
                false, Event::CPU_Tick_Pri),
      cachePort("GarnetSyntheticTraffic", this),
      retryPkt(NULL),
      size(p.memory_size),
      blockSizeBits(p.block_offset),
      numDestinations(p.num_dest),
      simCycles(p.sim_cycles),
      numPacketsMax(p.num_packets_max),
      numPacketsSent(0),
      singleSender(p.single_sender),
      singleDest(p.single_dest),
      trafficType(p.traffic_type),
      injRate(p.inj_rate),
      injVnet(p.inj_vnet),
      precision(p.precision),
      responseLimit(p.response_limit),
      requestorId(p.system->getRequestorId(this))
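In C++, the colon (:) after a constructor's parameter list introduces the member initializer list, which initializes the base class and the member variables before the constructor body runs. The GarnetSyntheticTraffic constructor above uses such a list for its base class and all of its members.
ClockedObject(p): this calls the constructor of the base class ClockedObject. It uses the parameter p (a Params struct) to initialize the base-class part of the GarnetSyntheticTraffic object.
Each following entry (for example, tickEvent([this]{ tick(); }, "GarnetSyntheticTraffic tick", false, Event::CPU_Tick_Pri)) initializes one member variable with a specific value or expression. For example:
tickEvent is initialized with a lambda, a string, one boolean (false), and an event priority (Event::CPU_Tick_Pri).
cachePort is initialized with a string and the this pointer (which points to the current object).
The last member, requestorId, is initialized with the return value of p.system->getRequestorId(this).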
The command line carries arguments such as --injectionrate=0.01 and --synthetic=uniform_random, and args is passed to Ruby.create_system:
Ruby.create_system(args, False, system)
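For example, uniform_random ends up (I have not yet traced exactly how) in src/cpu/testers/garnet_synthetic_traffic/GarnetSyntheticTraffic.cc. The constructor shown earlier contains trafficType(p.traffic_type), so it presumably arrives through the traffic_type parameter: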
else if (traffic == UNIFORM_RANDOM_) {
destination = random_mt.random(0, num_destinations - 1);
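As a quick illustration of what the uniform_random branch does, here is a Python sketch that draws destinations the same way. It assumes both bounds are inclusive, which is what passing num_destinations - 1 as the upper bound implies; `uniform_random_destination` and the fixed seed are illustrative choices, not gem5 names.

```python
import random


def uniform_random_destination(num_destinations: int, rng: random.Random) -> int:
    """Pick a destination in [0, num_destinations - 1], both ends inclusive."""
    return rng.randint(0, num_destinations - 1)


rng = random.Random(0)  # fixed seed only so that the sketch is repeatable
samples = [uniform_random_destination(16, rng) for _ in range(1000)]
```

With 16 directories, every value from 0 to 15 is a valid destination, including the sender's own index.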
GarnetSyntheticTraffic::generatePkt() creates a different req depending on the vnet.
The req is wrapped into a packet: PacketPtr pkt = new Packet(req, requestType);
sendPkt(pkt); then sends it out.
GarnetSyntheticTraffic::sendPkt(PacketPtr pkt)
if (!cachePort.sendTimingReq(pkt)) {
retryPkt = pkt; // RubyPort will retry sending
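The retry pattern in sendPkt can be mimicked with a toy model. This is a Python sketch under assumptions, not gem5 code: `ToyTester`, `BusyOncePort`, and `recv_req_retry` are hypothetical names, and the real retry handshake has more detail than shown here.

```python
class ToyTester:
    """Toy stand-in for the tester's send path (hypothetical, not gem5)."""

    def __init__(self, send_timing_req):
        self.send_timing_req = send_timing_req  # callable returning True/False
        self.retry_pkt = None

    def send_pkt(self, pkt):
        # Same shape as sendPkt(): remember the packet if the port refuses it.
        if not self.send_timing_req(pkt):
            self.retry_pkt = pkt

    def recv_req_retry(self):
        # Called when the receiver signals that it can accept a packet again.
        if self.retry_pkt is not None and self.send_timing_req(self.retry_pkt):
            self.retry_pkt = None


class BusyOncePort:
    """Refuses the first request and accepts every later one."""

    def __init__(self):
        self.accepted = []
        self.busy = True

    def __call__(self, pkt):
        if self.busy:
            self.busy = False
            return False
        self.accepted.append(pkt)
        return True
```

The point of the sketch is that a refused packet is not dropped: it is parked in one slot and resent once the receiver asks for a retry.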
The condition !cachePort.sendTimingReq(pkt) calls sendTimingReq.
The details of this function are in src/mem/port.hh:
RequestPort::sendTimingReq(PacketPtr pkt)
try {
bool succ = TimingRequestProtocol::sendReq(_responsePort, pkt);
if (!succ) { /* statement on failure elided in this excerpt */ }
return succ;
} catch (UnboundPortException) {
Here, RequestPort::sendTimingReq in src/mem/port.hh passes _responsePort into TimingRequestProtocol::sendReq, where it becomes *peer.
From sendTimingReq in port.hh, the next thing to look at is sendReq in src/mem/protocol/timing.cc.
In src/mem/protocol/timing.cc, sendReq is defined, and internally it simply calls peer->recvTimingReq(pkt).
/* The request protocol. */
TimingRequestProtocol::sendReq(TimingResponseProtocol *peer, PacketPtr pkt)
return peer->recvTimingReq(pkt);
The peer of TimingRequestProtocol here is the class TimingResponseProtocol from src/mem/protocol/timing.hh. The official comment there says:
 * @param peer Peer to send the packet to.
* @param pkt Packet to send.
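The call chain so far (sendTimingReq, then sendReq, then the peer's recvTimingReq) can be sketched in a few lines of Python. This is a toy model with hypothetical class names (`ToyRequestPort`, `ToyResponsePort`, `UnboundPortError`); it only mirrors the forwarding and the unbound-port check, nothing else from gem5.

```python
class UnboundPortError(Exception):
    """Raised when a request port is used before it has a peer."""


class ToyResponsePort:
    def __init__(self):
        self.received = []

    def recv_timing_req(self, pkt):
        self.received.append(pkt)
        return True  # True means "accepted"


def send_req(peer, pkt):
    # Counterpart of TimingRequestProtocol::sendReq: forward to the peer.
    return peer.recv_timing_req(pkt)


class ToyRequestPort:
    def __init__(self):
        self._response_port = None

    def bind(self, peer):
        self._response_port = peer

    def send_timing_req(self, pkt):
        if self._response_port is None:
            raise UnboundPortError("port is not bound")
        return send_req(self._response_port, pkt)
```

Note that the request port never inspects the packet itself; whatever recv_timing_req returns is handed straight back to the sender.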
In src/mem/ruby/system/RubyPort.cc, recvTimingReq calls makeRequest:
RubyPort::MemResponsePort::recvTimingReq(PacketPtr pkt)
// Submit the ruby request
RequestStatus requestStatus = owner.makeRequest(pkt);
In src/mem/ruby/system/Sequencer.cc, the Sequencer in turn performs an issueRequest:
Sequencer::makeRequest(PacketPtr pkt)
// non-aliased with any existing request in the request table, just issue
// to the cache
if (status != RequestStatus_Aliased)
issueRequest(pkt, secondary_type);
// TODO: issue hardware prefetches here
return RequestStatus_Issued;
// Inside issueRequest: create a Ruby request
std::shared_ptr<RubyRequest> msg;
// msg = ... (built from various fields of pkt; this turns pkt into a Ruby request)
m_mandatory_q_ptr->enqueue(msg, clockEdge(), latency); // insert into m_mandatory_q_ptr
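To make the enqueue(msg, clockEdge(), latency) call more concrete, here is a toy message buffer in Python. It is a simplified sketch that assumes a message becomes visible at enqueue time plus latency; `ToyMessageBuffer` and its method names are hypothetical and leave out everything else the real MessageBuffer does.

```python
import heapq
import itertools


class ToyMessageBuffer:
    """Time-ordered queue: a message is ready at enqueue time + latency."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order

    def enqueue(self, msg, now, latency):
        heapq.heappush(self._heap, (now + latency, next(self._order), msg))

    def is_ready(self, now):
        return bool(self._heap) and self._heap[0][0] <= now

    def peek(self):
        return self._heap[0][2]

    def dequeue(self, now):
        if not self.is_ready(now):
            raise RuntimeError("no message is ready at this time")
        return heapq.heappop(self._heap)[2]
```

A consumer polls is_ready with its own current time, which is the same shape as the isReady / peekMsgPtr / dequeue sequence used by the network interface later in this post.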
I actually found this path by working backwards. Someone on CSDN told me there is material saying the mandatory queue is the key to how a request enters the network, so I traced back from the mandatory queue to the packet send and then wrote the steps down in forward order. Otherwise, the many recvTimingReq implementations in the appendix are easy to confuse.

## m_mandatory_q_ptr = m_controller->getMandatoryQueue(); in src/mem/ruby/system/RubyPort.cc

src/mem/ruby/system/RubyPort.hh declares it as a message buffer: MessageBuffer* m_mandatory_q_ptr;

m_mandatory_q_ptr only has enqueue called on it here, never dequeue, which suggests that the dequeue happens through another name for the same mandatory queue. Going by step (4) of the lifecycle comment at the top, that would be the cache controller in Network_test-cache.sm, which then injects the message into a virtual network.

## The NI checks the message buffers and calls flitisizeMessage

```cpp
void
NetworkInterface::wakeup()
{
    std::ostringstream oss;
    for (auto &oPort: outPorts) {
        oss << oPort->routerID() << "[" << oPort->printVnets() << "] ";
    }
    DPRINTF(RubyNetwork, "Network Interface %d connected to router:%s "
            "woke up. Period: %ld\n", m_id, oss.str(), clockPeriod());

    assert(curTick() == clockEdge());
    MsgPtr msg_ptr;
    Tick curTime = clockEdge();

    // Checking for messages coming from the protocol
    // can pick up a message/cycle for each virtual net
    for (int vnet = 0; vnet < inNode_ptr.size(); ++vnet) {
        MessageBuffer *b = inNode_ptr[vnet];
        if (b == nullptr) {
            continue;
        }

        if (b->isReady(curTime)) { // Is there a message waiting
            msg_ptr = b->peekMsgPtr();
            if (flitisizeMessage(msg_ptr, vnet)) {
                b->dequeue(curTime);
            }
        }
    }
    // ... (the rest of wakeup() is not shown in this excerpt)
```
At this point, a pkt has been turned into a msg, stored in a message buffer, split into flits, and injected into the NoC.
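How many flits a message becomes can be estimated with a ceiling division of the message size by the flit size. The Python sketch below assumes that rule and uses a 16-byte flit purely as an example value; check the link width of your own configuration. `num_flits` is an illustrative helper, not a gem5 function.

```python
def num_flits(message_bytes: int, flit_bytes: int) -> int:
    """Ceiling division: number of flits needed to carry one message."""
    if message_bytes <= 0 or flit_bytes <= 0:
        raise ValueError("sizes must be positive")
    return (message_bytes + flit_bytes - 1) // flit_bytes


# Typical sizes from the lifecycle comment, with an assumed 16-byte flit.
control_flits = num_flits(8, 16)   # request / forward ("control") packets
data_flits = num_flits(72, 16)     # response ("data") packets
```

Under that assumption a control packet fits in a single flit, while a data packet needs five.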
Other similar code follows. I did not delete it; it is kept only for reference.
recvTimingReq(gem5::PacketPtr pkt) override
return bridge.recvTimingReq(pkt);
/**
 * Similar to TLM's non-blocking transport (AT)
 */
bool
SCSlavePort::recvTimingReq(gem5::PacketPtr packet)
{
    CAUGHT_UP;

    panic_if(packet->cacheResponding(), "Should not see packets where cache "
             "is responding");

    panic_if(!(packet->isRead() || packet->isWrite()),
             "Should only see read and writes at TLM memory\n");

    /* We should never get a second request after noting that a retry is
     * required */
    sc_assert(!needToSendRequestRetry);

    /* Remember if a request comes in while we're blocked so that a retry
     * can be sent to gem5 */
    if (blockingRequest) {
        needToSendRequestRetry = true;
        return false;
    }

    /* NOTE: normal tlm is blocking here. But in our case we return false
     * and tell gem5 when a retry can be done. This is the main difference
     * in the protocol:
     * if (requestInProgress)
     * {
     *     wait(endRequestEvent);
     * }
     * requestInProgress = trans;
     */

    /* Prepare the transaction */
    tlm::tlm_generic_payload * trans = mm.allocate();
    trans->acquire();
    packet2payload(packet, *trans);

    /* Attach the packet pointer to the TLM transaction to keep track */
    Gem5Extension* extension = new Gem5Extension(packet);
    trans->set_auto_extension(extension);

    /*
     * Pay for annotated transport delays.
     *
     * The header delay marks the point in time, when the packet first is seen
     * by the transactor. This is the point int time, when the transactor needs
     * to send the BEGIN_REQ to the SystemC world.
     *
     * NOTE: We drop the payload delay here. Normally, the receiver would be
     *       responsible for handling the payload delay. In this case, however,
     *       the receiver is a SystemC module and has no notion of the gem5
     *       transport protocol and we cannot simply forward the
     *       payload delay to the receiving module. Instead, we expect the
     *       receiving SystemC module to model the payload delay by deferring
     *       the END_REQ. This could lead to incorrect delays, if the XBar
     *       payload delay is longer than the time the receiver needs to accept
     *       the request (time between BEGIN_REQ and END_REQ).
     *
     * TODO: We could detect the case described above by remembering the
     *       payload delay and comparing it to the time between BEGIN_REQ and
     *       END_REQ. Then, a warning should be printed.
     */
    auto delay = sc_core::sc_time::from_value(packet->payloadDelay);
    // reset the delays
    packet->payloadDelay = 0;
    packet->headerDelay = 0;

    /* Starting TLM non-blocking sequence (AT) Refer to IEEE1666-2011 SystemC
     * Standard Page 507 for a visualisation of the procedure */
    tlm::tlm_phase phase = tlm::BEGIN_REQ;
    tlm::tlm_sync_enum status;
    status = transactor->socket->nb_transport_fw(*trans, phase, delay);
    /* Check returned value: */
    if (status == tlm::TLM_ACCEPTED) {
        sc_assert(phase == tlm::BEGIN_REQ);
        /* Accepted but is now blocking until END_REQ (exclusion rule)*/
        blockingRequest = trans;
    } else if (status == tlm::TLM_UPDATED) {
        /* The Timing annotation must be honored: */
        sc_assert(phase == tlm::END_REQ || phase == tlm::BEGIN_RESP);

        PayloadEvent<SCSlavePort> * pe;
        pe = new PayloadEvent<SCSlavePort>(*this, &SCSlavePort::pec, "PEQ");
        pe->notify(*trans, phase, delay);
    } else if (status == tlm::TLM_COMPLETED) {
        /* Transaction is over nothing has do be done. */
        sc_assert(phase == tlm::END_RESP);
        trans->release();
    }

    return true;
}
RequestPort::bind(Port &peer)
auto *response_port = dynamic_cast<ResponsePort *>(&peer);
fatal_if(!response_port, "Can't bind port %s to non-response port %s.",
name(), peer.name());
// request port keeps track of the response port
_responsePort = response_port;
// response port also keeps track of request port
src/mem/tport.cc has the following code, which essentially boils down to schedTimingResp.
bool
SimpleTimingPort::recvTimingReq(PacketPtr pkt)
{
    // the SimpleTimingPort should not be used anywhere where there is
    // a need to deal with snoop responses and their flow control
    // requirements
    if (pkt->cacheResponding())
        panic("SimpleTimingPort should never see packets with the "
              "cacheResponding flag set\n");

    bool needsResponse = pkt->needsResponse();
    Tick latency = recvAtomic(pkt);
    // turn packet around to go back to requestor if response expected
    if (needsResponse) {
        // recvAtomic() should already have turned packet into
        // atomic response
        assert(pkt->isResponse());
        schedTimingResp(pkt, curTick() + latency);
    } else {
        // queue the packet for deletion
        pendingDelete.reset(pkt);
    }

    return true;
}
* Queue a response packet to be sent out later and also schedule
* a send if necessary.
* @param pkt a response to send out after a delay
* @param when tick when response packet should be sent
void schedTimingResp(PacketPtr pkt, Tick when);
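GarnetSyntheticTraffic builds the packet. Once a request packet is ready, GarnetSyntheticTraffic::sendPkt calls cachePort.sendTimingReq(pkt), and sendTimingReq in turn calls TimingRequestProtocol::sendReq.
TimingRequestProtocol::sendReq takes the pkt together with _responsePort and calls recvTimingReq on that port. In other words, the peer->recvTimingReq(pkt) executed here is really _responsePort->recvTimingReq(pkt).
Which object _responsePort points to is not fixed in the code; it depends on what was passed to bind.
In this case the packet sent by the request side goes straight to the response side through a direct (one might say "out of thin air") connection, without passing through the network. The functions involved all live in port.hh or tport.hh.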
Again, start from the cachePort.sendTimingReq(pkt) call in void GarnetSyntheticTraffic::sendPkt(PacketPtr pkt).
Only this time, cachePort is connected to a RubyPort.
The request is still sent through TimingRequestProtocol::sendReq in the protocol code.
Earlier we looked at SimpleTimingPort::recvTimingReq(PacketPtr pkt) in src/mem/tport.cc.
This time the receiving side is RubyPort::MemResponsePort::recvTimingReq:
bool
RubyPort::MemResponsePort::recvTimingReq(PacketPtr pkt)
{
    DPRINTF(RubyPort, "Timing request for address %#x on port %d\n",
            pkt->getAddr(), id);

    if (pkt->cacheResponding())
        panic("RubyPort should never see request with the "
              "cacheResponding flag set\n");

    // ruby doesn't support cache maintenance operations at the
    // moment, as a workaround, we respond right away
    if (pkt->req->isCacheMaintenance()) {
        warn_once("Cache maintenance operations are not supported in Ruby.\n");
        pkt->makeResponse();
        schedTimingResp(pkt, curTick());
        return true;
    }
    // Check for pio requests and directly send them to the dedicated
    // pio port.
    if (pkt->cmd != MemCmd::MemSyncReq) {
        if (!pkt->req->isMemMgmt() && !isPhysMemAddress(pkt)) {
            assert(owner.memRequestPort.isConnected());
            DPRINTF(RubyPort, "Request address %#x assumed to be a "
                    "pio address\n", pkt->getAddr());

            // Save the port in the sender state object to be used later to
            // route the response
            pkt->pushSenderState(new SenderState(this));

            // send next cycle
            RubySystem *rs = owner.m_ruby_system;
            owner.memRequestPort.schedTimingReq(pkt,
                curTick() + rs->clockPeriod());
            return true;
        }
    }

    // Save the port in the sender state object to be used later to
    // route the response
    pkt->pushSenderState(new SenderState(this));

    // Submit the ruby request
    RequestStatus requestStatus = owner.makeRequest(pkt);

    // If the request successfully issued then we should return true.
    // Otherwise, we need to tell the port to retry at a later point
    // and return false.
    if (requestStatus == RequestStatus_Issued) {
        DPRINTF(RubyPort, "Request %s 0x%x issued\n", pkt->cmdString(),
                pkt->getAddr());
        return true;
    }

    // pop off sender state as this request failed to issue
    SenderState *ss = safe_cast<SenderState *>(pkt->popSenderState());
    delete ss;

    if (pkt->cmd != MemCmd::MemSyncReq) {
        DPRINTF(RubyPort,
                "Request %s for address %#x did not issue because %s\n",
                pkt->cmdString(), pkt->getAddr(),
                RequestStatus_to_string(requestStatus));
    }

    addToRetryList();

    return false;
}
bool
RubyPort::PioResponsePort::recvTimingReq(PacketPtr pkt)
{
    for (size_t i = 0; i < owner.request_ports.size(); ++i) {
        AddrRangeList l = owner.request_ports[i]->getAddrRanges();
        for (auto it = l.begin(); it != l.end(); ++it) {
            if (it->contains(pkt->getAddr())) {
                // generally it is not safe to assume success here as
                // the port could be blocked
                [[maybe_unused]] bool success =
                    owner.request_ports[i]->sendTimingReq(pkt);
                assert(success);
                return true;
            }
        }
    }
    panic("Should never reach here!\n");
}
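To contrast the two receivers discussed above, the Python sketch below sends a packet through the same send_req helper to two different toy peers. Everything here is hypothetical (`ToySimpleTimingPort`, `ToyRubyPort`, the status strings); it only illustrates that the sender's code is identical and the bound peer decides what happens.

```python
def send_req(peer, pkt):
    # Same forwarding step in both cases: call the peer's receive function.
    return peer.recv_timing_req(pkt)


class ToySimpleTimingPort:
    """Answers directly, without any network (loosely modelled on tport.cc)."""

    def __init__(self):
        self.scheduled_responses = []

    def recv_timing_req(self, pkt):
        self.scheduled_responses.append(("resp", pkt))
        return True


class ToyRubyPort:
    """Hands the packet to a make_request callback and may refuse it."""

    def __init__(self, make_request):
        self.make_request = make_request  # returns a status string
        self.retry_list = []

    def recv_timing_req(self, pkt):
        if self.make_request(pkt) == "Issued":
            return True
        self.retry_list.append(pkt)  # sender must retry later
        return False
```

With the first peer the request is turned around on the spot; with the second, acceptance depends on whether the request could be issued into Ruby, which is where the mandatory queue comes in.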
