This post is the first in a series about some of the more interesting issues I’ve encountered while working on the Streams Standard.
In the Streams Standard we have the concept of readable streams, which are an abstraction on top of lower-level underlying sources. In an abstract sense, an underlying source is “where the chunks of data come from.” The most basic underlying sources are things like files or HTTP connections. (More complicated ones could be, for example, an underlying source that randomly generates data in memory for test purposes, or one that synthesizes data from multiple concrete locations.) These basic underlying sources are concerned with the direct production of bytes.
The major goal of the Streams Standard is to provide an efficient abstraction specifically for I/O. Thus, to design a suitable readable stream abstraction, we can’t just think about general concepts of reactivity or async iterables or observables. We need to dig deeper into how, exactly, the underlying sources will work. Otherwise we might find ourselves scrambling to reform the API at the last minute when confronted with real-world implementation challenges. (Oops.)
The current revision of the standard describes underlying sources as belonging to two broad categories: push sources, where data is constantly flowing in, and pull sources, where you specifically request it. The prototypical examples of these categories are TCP sockets and file descriptors, respectively. Once a TCP connection is open, the remote server will begin pushing data to you. With a file, by contrast, no I/O happens until you ask the OS to do a read.
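To make the push/pull distinction concrete, here is a minimal sketch in TypeScript. Everything in it is hypothetical: `PullSource`, `PushSource`, `drainPull`, `collectPush`, and `simulateArrival` are names made up for illustration and are not part of the Streams Standard or of any real socket or file API. The only point is who initiates the transfer of bytes.

```typescript
// Hypothetical pull source, loosely modeled on a file descriptor:
// no bytes move until the consumer calls read().
class PullSource {
  private data: Uint8Array;
  private offset = 0;
  readCalls = 0; // counts explicit read requests, for illustration only

  constructor(data: Uint8Array) {
    this.data = data;
  }

  // Returns up to `size` bytes, or null once the data is exhausted.
  read(size: number): Uint8Array | null {
    this.readCalls++;
    if (this.offset >= this.data.length) return null;
    const chunk = this.data.slice(this.offset, this.offset + size);
    this.offset += chunk.length;
    return chunk;
  }
}

// Hypothetical push source, loosely modeled on a TCP socket:
// chunks arrive whenever the remote side sends them.
class PushSource {
  private listener: ((chunk: Uint8Array) => void) | null = null;

  onData(listener: (chunk: Uint8Array) => void): void {
    this.listener = listener;
  }

  // Stands in for the remote server sending data; the consumer never calls this.
  simulateArrival(chunk: Uint8Array): void {
    if (this.listener !== null) this.listener(chunk);
  }
}

// Consuming a pull source: the consumer drives every read.
function drainPull(source: PullSource, size: number): number[] {
  const bytes: number[] = [];
  let chunk = source.read(size);
  while (chunk !== null) {
    chunk.forEach((b) => { bytes.push(b); });
    chunk = source.read(size);
  }
  return bytes;
}

// Consuming a push source: the consumer can only buffer what shows up.
function collectPush(source: PushSource): number[] {
  const bytes: number[] = [];
  source.onData((chunk) => {
    chunk.forEach((b) => { bytes.push(b); });
  });
  return bytes; // starts empty and fills in as data arrives
}
```

In this sketch the pull consumer decides when and how much to read, while the push consumer has no say in either; it just receives whatever arrives. A readable stream abstraction has to sit comfortably on top of both shapes.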
I ended up writing the following posts: