CS144 lab

Git flow: Stack Overflow reference link

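This lab only involves origin, public, and local in the figure below.

origin: a remote repository that stores your own code

local: the local repository, i.e. your working tree

public: a remote repository that holds the lab's starter code; periodically run git pull public branch_name in local (and push the result to origin) to stay in sync with public

Figure (to be drawn)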

check0

2 Networking by hand

2.1 Fetch a Web page

  • telnet cs144.keithw.org http

    • Input:

      GET /lab0/wan_nan HTTP/1.1
      Host: cs144.keithw.org
      Connection: close
    • Output:

      X-You-Said-Your-SunetID-Was: wan_nan

      X-Your-Code-Is: 433440

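  • Why is the Ethernet MTU (Maximum Transmission Unit) 1500 bytes?

    The MTU is set to 1500 bytes as a compromise between fast response and maximum throughput: larger frames carry less header overhead per byte, but they occupy the link longer and cost more to retransmit when corrupted.

    Detailed explanation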

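  • What is Telnet?

    • It runs on top of TCP/IP

    • It can talk to HTTP/FTP services

      telnet baidu.com http and telnet baidu.com 80 are equivalent (port 80 is the port on which the server listens for requests from web clients)

      After running either command, type the HTTP request line by line in the terminal

      How to send an HTTP request using Telnet

    • It can be used to log in to a remote server

      Telnet vs. SSH for connecting to a server:

      • Telnet transmits in plaintext; SSH encrypts the traffic
      • Default ports: Telnet uses 23, SSH uses 22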
  • A basic HTTP request message

    GET /hello HTTP/1.1
    Host: www.baidu.com
    Connection: close

    Meaning of each field
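To make the fields of the request above concrete, here is a minimal sketch that assembles the same message in C++. The request line carries the method (GET), the path (/hello), and the protocol version (HTTP/1.1); the Host header names the site being asked for; Connection: close asks the server to close the connection after replying. The helper name build_get_request is hypothetical and is not part of the lab's starter code.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the lab's starter code): builds a minimal
// HTTP/1.1 GET request. Every line ends with CRLF ("\r\n"), and one empty
// line (a final CRLF) marks the end of the header section.
std::string build_get_request( const std::string& host, const std::string& path )
{
  assert( !path.empty() && path.front() == '/' ); // the request target starts with '/'
  std::string request;
  request += "GET " + path + " HTTP/1.1\r\n"; // method, path, protocol version
  request += "Host: " + host + "\r\n";        // which site on the server is wanted
  request += "Connection: close\r\n";         // ask the server to close after replying
  request += "\r\n";                          // empty line: end of headers
  return request;
}
```

When typing the request by hand in telnet, the final empty line corresponds to pressing Enter one extra time after the last header.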

2.2 Send yourself an email

2.3 Listening and connecting

telnet: a client program that makes outgoing connections to programs running on other computers

server: the kind of program that waits around for clients to connect to it

3 Writing a network program using an OS stream socket

write a program called “webget”

Description of socket

You will make use of a feature provided by the Linux kernel, and by most other operating systems: the ability to create a reliable bidirectional byte stream between two programs, one running on your computer, and the other on a different computer across the Internet (e.g., a Web server such as Apache or nginx, or the netcat program).

This feature is known as a stream socket. To your program and to the Web server, the socket looks like an ordinary file descriptor (similar to a file on disk, or to the stdin or stdout I/O streams). When two stream sockets are connected, any bytes written to one socket will eventually come out in the same order from the other socket on the other computer.
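The file-descriptor behaviour described above can be tried locally without a network. The following is a rough sketch, assuming a POSIX system: it uses socketpair() to create two connected stream sockets in one process, writes bytes into one end, and reads them from the other end until EOF. The function name send_through_stream_sockets is hypothetical, and the demo assumes the message is small enough to fit in the socket buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical demo (POSIX only): sends `message` through a connected pair of
// stream sockets and returns everything received on the other end.
// Assumption: `message` fits in the socket buffer, because this
// single-threaded demo finishes writing before it starts reading.
std::string send_through_stream_sockets( const std::string& message )
{
  int fds[2];
  if ( socketpair( AF_UNIX, SOCK_STREAM, 0, fds ) != 0 ) {
    return {};
  }

  // write() may accept fewer bytes than requested, so loop until all are sent.
  std::size_t sent = 0;
  while ( sent < message.size() ) {
    const ssize_t n = write( fds[0], message.data() + sent, message.size() - sent );
    if ( n <= 0 ) {
      break;
    }
    sent += static_cast<std::size_t>( n );
  }
  close( fds[0] ); // closing the writing end is what lets the reader see EOF

  // read() returns 0 at EOF; until then it may hand back the data in pieces.
  std::string received;
  char chunk[4096];
  ssize_t n = 0;
  while ( ( n = read( fds[1], chunk, sizeof( chunk ) ) ) > 0 ) {
    received.append( chunk, static_cast<std::size_t>( n ) );
  }
  close( fds[1] );
  assert( received.size() <= message.size() );
  return received;
}
```

The same plain read() and write() calls work on a regular file or a pipe, which is the sense in which a socket "looks like an ordinary file descriptor".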

Problems you may encounter

Although the network tries to deliver every datagram, in practice datagrams can be (1) lost, (2) delivered out of order, (3) delivered with the contents altered, or even (4) duplicated and delivered more than once. It’s normally the job of the operating systems on either end of the connection to turn “best-effort datagrams” (the abstraction the Internet provides) into “reliable byte streams” (the abstraction that applications usually want).

3.1 Let’s get started—fetching and building the starter code

I wrote up the Git flow steps in a separate post


Mind the versions!

Upgrading WSL Ubuntu 20.04 LTS to 22.04 LTS

Switching gcc/g++ versions

  • Ubuntu 22.04 uses libssl3, and thus libssl1.1 is deprecated at this point.

    The fix I followed

    Installati.one provides complete reference guide on how to install various applications in Linux


3.2 Modern C++: mostly safe but still fast and low-level

standards about modern C++ and Git

For references to this style, please see the C++ Core Guidelines (http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines).

The basic idea is to make sure that every object is designed to have the smallest possible public interface, has a lot of internal safety checks and is hard to use improperly, and knows how to clean up after itself.

3.3 Reading the Minnow support code

Please note that a Socket is a type of FileDescriptor, and a TCPSocket is a type of Socket.

3.4 Writing webget

The code:

void get_URL( const string& host, const string& path )
{
  cerr << "Function called: get_URL(" << host << ", " << path << ")\n";

  // Address(): construct by resolving a hostname and service name.
  const Address serverAddr( host, "http" );

  // Open a TCP connection to the server.
  TCPSocket clientSocket;
  clientSocket.connect( serverAddr );

  // Send the request. "Connection: close" asks the server to close the
  // connection after replying, which is what produces EOF in the loop below.
  const string request = "GET " + path + " HTTP/1.1\r\n" +
                         "Host: " + host + "\r\n" +
                         "Connection: close\r\n\r\n";
  clientSocket.write( request );

  // A single read() may return only part of the reply, so read until EOF.
  string recvPayload;
  while ( !clientSocket.eof() ) {
    clientSocket.read( recvPayload );
    cout << recvPayload;
    recvPayload.clear();
  }
  clientSocket.close();
}
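  • Sending calls the socket's write interface directly, and receiving calls its read interface directly

    This works because the socket looks like an ordinary file descriptor (similar to a file on disk, or to the stdin or stdout I/O streams)

  • Receiving is only finished once EOF is reached, so read in a loop until EOF

  • C++17's string_view

    std::string_view only stores a pointer to the underlying characters and a length

    It provides a view of a string: you can "observe" the string through it in various ways, but you cannot modify the string. Because it is read-only, it does not hold its own copy of the string; it shares the storage of the string it refers to. In other words, constructing a view copies no characters. You can also freely move the view (for example, shrink it from either end); moving the view does not move or change the underlying string.

  • Classes whose constructor is protected

    In this situation the class usually has no static factory method, and the point of the protected constructor is to make the class usable only as a base class: the class itself cannot be instantiated from outside, only its subclasses can.

    [C++] Notes on protected constructors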
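The last two notes above can be illustrated with a short sketch. The names skip_prefix, Base, and Derived are hypothetical and only serve the example; they are not taken from the Minnow code. The first part shows that moving a string_view leaves the underlying string untouched, and the second part checks at compile time that a class with a protected constructor can only be created through a subclass.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical helper: returns a view of `text` without its first `n`
// characters. Only the view (pointer + length) changes; no characters are
// copied or modified.
std::string_view skip_prefix( std::string_view text, std::size_t n )
{
  assert( n <= text.size() ); // remove_prefix requires n <= size()
  text.remove_prefix( n );
  return text;
}

// Hypothetical classes illustrating a protected constructor.
class Base
{
  int id_;

protected:
  explicit Base( int id ) : id_( id ) {} // callable only from Base and its subclasses

public:
  int id() const { return id_; }
};

class Derived : public Base
{
public:
  explicit Derived( int id ) : Base( id ) {}
};

// Outside code cannot construct a Base directly, but it can construct a Derived.
static_assert( !std::is_constructible_v<Base, int> );
static_assert( std::is_constructible_v<Derived, int> );
```

One caveat worth keeping in mind: a string_view does not keep the string alive, so the view must not outlive the string it refers to.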

4 An in-memory reliable byte stream

My verdict: not as good as a LeetCode simulation problem

After running cmake --build build --target check0 to test, the result is as follows:

  • ByteStream is implemented with a string
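As a rough illustration of the string-backed approach, here is a simplified sketch. SimpleByteStream and its members are hypothetical names; the real Minnow ByteStream splits these operations across Reader and Writer classes, and this sketch does not reproduce that interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>

// Simplified, hypothetical byte stream backed by a std::string.
// It is not the Minnow interface; it only shows the bookkeeping involved.
class SimpleByteStream
{
  uint64_t capacity_;
  std::string buffer_ {};     // bytes pushed but not yet popped
  uint64_t bytes_pushed_ = 0; // cumulative count of accepted bytes
  uint64_t bytes_popped_ = 0; // cumulative count of removed bytes
  bool closed_ = false;

public:
  explicit SimpleByteStream( uint64_t capacity ) : capacity_( capacity ) {}

  uint64_t available_capacity() const { return capacity_ - buffer_.size(); }

  // Writer side: accept as many bytes as fit and drop the rest.
  void push( std::string_view data )
  {
    if ( closed_ ) {
      return;
    }
    const uint64_t n = std::min<uint64_t>( data.size(), available_capacity() );
    buffer_.append( data.substr( 0, n ) );
    bytes_pushed_ += n;
    assert( buffer_.size() <= capacity_ );
  }

  void close() { closed_ = true; }

  // Reader side: look at the buffered bytes, then remove them.
  std::string_view peek() const { return buffer_; }

  void pop( uint64_t len )
  {
    const uint64_t n = std::min<uint64_t>( len, buffer_.size() );
    buffer_.erase( 0, n ); // shifts the remaining bytes to the front
    bytes_popped_ += n;
  }

  bool is_finished() const { return closed_ && buffer_.empty(); }
  uint64_t bytes_pushed() const { return bytes_pushed_; }
  uint64_t bytes_popped() const { return bytes_popped_; }
};
```

Erasing from the front of a std::string costs time proportional to the bytes that remain, so a queue of chunks may perform better; the single string is simply the most direct version.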

check1

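The Reassembler in check1 feels like one big simulation problem: there are many different cases to consider, it is fairly close to real-world use, and the algorithmic difficulty is not high

Time spent: 10h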

A reasonable view has it that TCP implementations count as the most widely used nontrivial computer programs on the planet.

⋆Why am I doing this?

TCP robustness against reordering and duplication comes
from its ability to stitch arbitrary excerpts of the byte stream back into the original
stream. Implementing this in a discrete testable module will make handling incoming
segments easier.

The receiver must reassemble the segments into the contiguous stream of bytes that they started out as.
In this lab you’ll write the data structure that will be responsible for this reassembly: a Reassembler.

Approach

First, understand this figure:

[Figure: image-20230630165524261]

Code logic:

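  1. First record the length of the whole stream, whole_length

    Do this before data is trimmed, so that later trimming cannot make the recorded length wrong

  2. Preprocessing: preProcess()

    • first_unassembled_index is simply output.bytes_pushed()

    • If the incoming data lies entirely outside [first_unassembled_index, first_unacceptable_index), discard it

    • If the ByteStream has no spare capacity, the data cannot be inserted, so there is nothing further to do for it (but do not close the output stream merely because the available capacity is 0: the reader may just not have read yet, so later data may still fit)

    • Trim both ends: if data only partly falls outside [first_unassembled_index, first_unacceptable_index), keep just the part inside the range

  3. Resolve overlaps between the new string data and the strings already in the map: process_overlapping()

    Iterate over all existing strings in the map (reassembler_buffer)

    There are four cases:

    • The new string encloses an existing string

      Erase the existing string

    • The new string is enclosed by an existing string

      Return immediately

    • The tail of the new string overlaps an existing string

      Truncate the new string

    • The head of the new string overlaps an existing string

      Truncate the new string

    Remember to update first_index, data, and bytes_in_Reassembler

    Be careful with the iterator when erasing while iterating: see "The correct way to erase while iterating over a C++ map"

  4. Insert the trimmed new string into reassembler_buffer

  5. Iterate over the map (reassembler_buffer), push every string that can now be delivered into the ByteStream (Writer), and erase it from the map

    Be careful with the iterator when erasing while iterating

    Remember to update first_unassembled_index and bytes_in_Reassembler

  6. Check whether the end of the stream has been reached

    If all the data in the stream has been received, call close()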
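The steps above can be sketched compactly as follows. This is a simplified illustration rather than the code written for the lab: SimpleReassembler and its member names are hypothetical, a std::string stands in for the ByteStream's Writer, and the sketch assumes the reader drains the output immediately, so the acceptable window always spans capacity bytes starting at first_unassembled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical, simplified reassembler (not the Minnow interface).
// Assumption: the reader drains the output immediately, so the acceptable
// window is always [first_unassembled, first_unassembled + capacity).
class SimpleReassembler
{
  uint64_t capacity_;
  std::map<uint64_t, std::string> buffer_ {}; // disjoint pending substrings, keyed by first index
  std::string output_ {};                     // stands in for the ByteStream's Writer
  uint64_t pending_ = 0;                      // bytes currently stored in buffer_
  uint64_t total_length_ = 0;                 // meaningful only once has_last_ is true
  bool has_last_ = false;
  bool closed_ = false;

public:
  explicit SimpleReassembler( uint64_t capacity ) : capacity_( capacity ) {}

  void insert( uint64_t first_index, std::string data, bool is_last_substring )
  {
    // Step 1: record the full stream length before any trimming.
    if ( is_last_substring ) {
      total_length_ = first_index + data.size();
      has_last_ = true;
    }

    // Step 2: clip the data to the acceptable window.
    const uint64_t first_unassembled = output_.size();
    uint64_t begin = std::max( first_index, first_unassembled );
    uint64_t end = std::min<uint64_t>( first_index + data.size(), first_unassembled + capacity_ );
    bool keep = begin < end;
    if ( keep ) {
      data = data.substr( begin - first_index, end - begin );
    }

    // Step 3: resolve overlaps with the substrings already stored.
    for ( auto it = buffer_.begin(); keep && it != buffer_.end(); ) {
      const uint64_t old_begin = it->first;
      const uint64_t old_end = old_begin + it->second.size();
      if ( old_end <= begin || end <= old_begin ) { // no overlap
        ++it;
      } else if ( begin <= old_begin && old_end <= end ) { // new encloses old
        pending_ -= it->second.size();
        it = buffer_.erase( it );
      } else if ( old_begin <= begin && end <= old_end ) { // old encloses new
        keep = false;
      } else if ( old_begin < begin ) { // head of new overlaps tail of old
        data.erase( 0, old_end - begin );
        begin = old_end;
        ++it;
      } else { // tail of new overlaps head of old
        data.resize( old_begin - begin );
        end = old_begin;
        ++it;
      }
    }

    // Step 4: store what is left of the new substring.
    if ( keep ) {
      pending_ += data.size();
      buffer_.emplace( begin, std::move( data ) );
    }

    // Step 5: deliver every substring that is now contiguous with the output.
    for ( auto it = buffer_.begin(); it != buffer_.end() && it->first == output_.size(); ) {
      pending_ -= it->second.size();
      output_ += it->second;
      it = buffer_.erase( it );
    }
    assert( pending_ <= capacity_ );

    // Step 6: close once every byte of the stream has been delivered.
    if ( has_last_ && output_.size() == total_length_ ) {
      closed_ = true;
    }
  }

  const std::string& output() const { return output_; }
  uint64_t bytes_pending() const { return pending_; }
  bool is_closed() const { return closed_; }
};
```

Because stored substrings are kept disjoint, the delivery loop only ever needs to compare the first key in the map with the current output length.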

note

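  • C++11: Default Member Initializers (in-class member initialization)

    Besides the constructor's member initializer list, C++11 also allows non-static data members to be initialized where they are declared, using either an equals sign = or braces {}.

    For example, a map member can be initialized in place with {}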
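  • Erasing elements from a map while iterating over it

    The correct way to erase while iterating over a C++ map

    it++ returns a temporary copy of the iterator as it was before the increment, so erase() receives the old position while it has already moved on to the next element

    map<string, int> testMap;

    for ( auto it = testMap.begin(); it != testMap.end(); ) {
      if ( it->second == xxx ) {
        // it is advanced first; erase() gets the copy of the old position,
        // so it never refers to the erased element.
        testMap.erase( it++ );
      } else {
        it++;
      }
    }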
  • Debugging with gdb breakpoints

    Official tutorial: CS144 Debugging

    My own notes on common commands: GDB_skills
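The first two notes in this list can be combined in one small sketch. PendingBuffer and its members are hypothetical names chosen for the example. It shows both forms of default member initializer, and an alternative way to erase while iterating: since C++11, map::erase returns the iterator following the erased element, so assigning that result back to the loop iterator works as well as the erase( it++ ) form.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical class used only to illustrate the two notes.
class PendingBuffer
{
  std::map<uint64_t, std::string> segments_ {}; // brace initializer: starts empty
  uint64_t bytes_pending_ = 0;                  // equals-sign initializer

public:
  // Stores `data` under `index` unless that index is already present.
  void add( uint64_t index, std::string data )
  {
    const uint64_t size = data.size();
    if ( segments_.emplace( index, std::move( data ) ).second ) {
      bytes_pending_ += size;
    }
  }

  // Erases every segment whose index is smaller than `index`.
  void drop_before( uint64_t index )
  {
    for ( auto it = segments_.begin(); it != segments_.end(); ) {
      if ( it->first < index ) {
        assert( bytes_pending_ >= it->second.size() );
        bytes_pending_ -= it->second.size();
        it = segments_.erase( it ); // erase returns the next valid iterator
      } else {
        ++it;
      }
    }
  }

  uint64_t bytes_pending() const { return bytes_pending_; }
  std::size_t segment_count() const { return segments_.size(); }
};
```

No constructor is needed here: the default member initializers give every new PendingBuffer an empty map and a zero counter.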

pass screenshot

check2