The friendship between Haskell and C
If ever there were two programming languages said to be at odds with each other, it would be Haskell and C. But this isn’t as true as it seems; they can play quite nicely together. Haskell’s foreign function interface (FFI) lets us write Haskell code that uses libraries written in other languages, notably C.
As a quick introduction to how the Haskell FFI works, I’ll be talking about my memfd package, which is available on Hackage. This article is part 1 of 2, laying out concepts and motivation. Part 2 will go over the memfd package, the Linux API that it uses, and how their friendly relationship is expressed in Haskell.
If you want to support my Haskell open source work, one excellent way is to subscribe to this publication.
The word ‘file’ arises because files are often used for persistent storage, like papers in a cabinet. When a process opens a file, it receives a file descriptor (FD). Conceptually, this is a reference to the file; in fact, it’s an integer. This integer refers to a particular file, but only within the context of the process that it belongs to, and only from the time the FD is opened until the time it’s closed. The process uses the FD to read from or write to the file.
But we must get past storage cabinets and learn to think of a file more abstractly, as an inter-process communication channel. An FD is an integer that, within the context of a process, abstractly identifies some resource, and there are many kinds of resources. Some examples of file descriptors:
- The standard input, output, and error streams (stdin, stdout, and stderr) that every process implicitly starts with as FDs 0, 1, and 2. Depending on how the process was initialized, each of these streams might represent a persistent file, or it might represent another process.
- When a network client opens a connection to a server, the server process receives an FD representing the socket that it uses to communicate with the client.
- When a Wayland graphical application starts, it first connects to a UNIX-domain socket to initiate contact with the graphics server, receiving a socket FD it will use to send messages such as “a new frame is ready to display.”
- The next thing a Wayland client does is open a file (creating another FD) into which it will write the image data that it wants to display on screen.
- It then sends that FD over the socket to the Wayland server. This results in the creation of yet another FD within the Wayland server process. Both FDs refer to the same file, thus establishing another form of inter-process communication (one that is faster than the socket for transmitting large amounts of graphical data).
Sockets make the terminology awkward, because a socket is not a file, but the integer that we use to identify it is called a file descriptor anyway. A socket has a file descriptor that doesn’t correspond to a file. This state of affairs is said to be quite elegant, and perhaps it is, though the nomenclature is just painful.
My last three examples of file descriptors all pertained to Wayland because that was my motivation for writing the memfd package, the discussion of which is forthcoming.
In the System.Posix.Types module of the base package, we find the following definition:
newtype Fd = Fd CInt
A file descriptor is, in fact, only a number. This is a newtype for CInt, which is called a “C Int” because it corresponds to the “int” type in C. This is how we can start to see that Haskell and C are friends; Haskell’s standard library has definitions like this to let us talk about C using C’s own terms.
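To make the “only a number” point concrete, here is a minimal sketch that unwraps the newtype for the three standard streams. It assumes the unix package is available for stdInput, stdOutput, and stdError; the helper name fdNumber is made up for illustration.

```haskell
import Foreign.C.Types (CInt)
import System.Posix.IO (stdError, stdInput, stdOutput)
import System.Posix.Types (Fd (..))

-- | Unwrap the newtype to get at the underlying C int.
-- (fdNumber is a hypothetical helper, not part of any library.)
fdNumber :: Fd -> CInt
fdNumber (Fd n) = n

main :: IO ()
main =
  -- Prints the descriptor numbers of stdin, stdout, and stderr, one per line.
  mapM_ (print . fdNumber) [stdInput, stdOutput, stdError]
```

Pattern matching on the Fd constructor is all it takes, because there is nothing inside the wrapper besides the integer.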
Side note: A Handle is related to but not quite the same as an FD. The Handle type is what you’ll normally be using to write cross-platform code; it’s in some ways more abstract. The Fd type is what we use for Unix-specific work. If you need to turn an Fd into a Handle, you can use fdToHandle.
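As a rough sketch of that conversion, the following wraps an Fd in a Handle and writes a line through it. It assumes the unix package; writeViaHandle is a hypothetical helper name. Because closing the resulting Handle also closes the underlying descriptor, the example duplicates stdout first with dup rather than handing over FD 1 itself.

```haskell
import System.IO (hClose, hPutStrLn)
import System.Posix.IO (dup, fdToHandle, stdOutput)
import System.Posix.Types (Fd)

-- | Wrap an Fd in a Handle, write one line, and close the Handle.
-- Closing the Handle also closes the descriptor it was made from.
-- (writeViaHandle is a hypothetical helper for this sketch.)
writeViaHandle :: Fd -> String -> IO ()
writeViaHandle fd message = do
  h <- fdToHandle fd
  hPutStrLn h message
  hClose h

main :: IO ()
main = do
  -- Duplicate stdout so that the original FD 1 stays open afterwards.
  fd <- dup stdOutput
  writeViaHandle fd "written through a Handle made from an Fd"
```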
The fact that there is more than one kind of file system rose to the attention of computer laypersons with the proliferation of floppy disks. When you bought a pack in the store, its label indicated whether it was formatted for PC or Mac (nobody cares about the Linux users). The magnetic disk didn’t come from the store in a truly blank state; it was set up with the initial structure that the computer needs in order to see that the disk is blank. Since Microsoft and Apple chose different file systems, slightly different disks were produced for each market segment. Linux users thought this was silly: why didn’t normal people just format the disk themselves when they got it home like we did, why are they outside having fun while we format our disks, and why did they not invite us?
The distinctions between PC and Mac file systems are somewhat trivial, merely differing in implementation details. Their overall purpose is the same: to arrange files on the disk. If this is all a file system means to you, it’s time to expand your mind to encompass other kinds:
- tmpfs looks like a normal disk file system, but it’s a mirage; this system is backed by volatile storage (RAM) rather than persistent storage. If you need to write a file temporarily but don’t need it to persist indefinitely, tmpfs is appropriate because it’s faster. If you need the file to not persist indefinitely, tmpfs is appropriate because if you forget to clean up your garbage, it will always get cleaned up automatically the next time the system restarts.
- sshfs also looks like a normal file system, but it’s backed by another file system on another computer. If you can use SSH to access a remote computer, you can use sshfs to map the remote computer’s files into your own system’s directory tree and pretend that their stuff is yours. It’s pretty neat.
- procfs isn’t a general-purpose storage system at all, but rather a means of reading information about your computer. It appears on most systems as the /proc directory, which contains mostly a bunch of text files. For example, /proc/meminfo shows how much RAM you have, and /proc/<process id>/environ shows all the environment variables for a running process. I recommend poking around in there sometime, because you will find a fascinating amount of stuff.
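Since procfs entries are read like ordinary text files, poking around can be done from Haskell with nothing but base. The sketch below is Linux-only and assumes the usual “Name: value” line layout of /proc/meminfo; meminfoField is a hypothetical helper name.

```haskell
import Control.Exception (evaluate)
import Data.List (isPrefixOf)
import Data.Maybe (listToMaybe)

-- | Find the first line of /proc/meminfo whose label matches the given name,
-- for example "MemTotal". Returns Nothing if no such line exists.
-- (meminfoField is a hypothetical helper for this sketch.)
meminfoField :: String -> IO (Maybe String)
meminfoField name = do
  contents <- readFile "/proc/meminfo"
  -- readFile is lazy; force the whole read now so the handle is closed promptly.
  _ <- evaluate (length contents)
  pure (listToMaybe [l | l <- lines contents, (name ++ ":") `isPrefixOf` l])

main :: IO ()
main = meminfoField "MemTotal" >>= print
```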
Everything I’ve so far described as a file has a file path. When we create a file, we create it within a directory and with a name. The directory and name together constitute an address by which we can open the file later. The association of a file, a name, and a directory is called a hard link.
A somewhat less-considered notion is that a file can have more than one hard link. You can use the ln command-line utility to give a file additional hard links. Doing so doesn’t copy the file, nor does it create a situation in which one of the paths is the real one and the other merely a pointer (such a pointer is called a symbolic link). The file is simply linked into the file system in more than one place.
One way to delete a file is to use unlink. (The rm utility is more commonly taught because it has more features, including the ability to delete directories.) But this doesn’t necessarily result in the destruction of the file; it only removes a hard link. If there are other hard links, then the file still exists. Only once a file’s hard link count reaches 0 is the file really gone.
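The link count can be observed from Haskell. The following is a minimal sketch, assuming the unix, directory, and filepath packages and a temporary directory that supports hard links; the file names and the helper linkCounts are made up for illustration, and the sketch fails if the second name already exists.

```haskell
import System.Directory (getTemporaryDirectory)
import System.FilePath ((</>))
import System.Posix.Files (createLink, getFileStatus, linkCount, removeLink)
import System.Posix.Types (LinkCount)

-- | Create a file, give it a second hard link, and report the link count
-- before and after adding it. Both names are removed again at the end.
-- (linkCounts is a hypothetical helper for this sketch.)
linkCounts :: FilePath -> FilePath -> IO (LinkCount, LinkCount)
linkCounts original extra = do
  writeFile original "one file, two names\n"
  before <- linkCount <$> getFileStatus original
  createLink original extra        -- same file, second name
  after <- linkCount <$> getFileStatus original
  removeLink original              -- the file survives: 'extra' still names it
  contents <- readFile extra
  putStr contents                  -- forces the whole lazy read before unlinking
  removeLink extra                 -- now no name refers to the file
  pure (before, after)

main :: IO ()
main = do
  dir <- getTemporaryDirectory
  counts <- linkCounts (dir </> "hardlink-demo-a.txt") (dir </> "hardlink-demo-b.txt")
  print counts
```

On a typical Linux file system the two counts should be 1 and 2, and the contents remain readable through the second name after the first has been removed.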
No, I lied: a file can exist without any hard links at all. I’ll give a quick demonstration. The following requires base, directory, and filepath.
import Prelude
import System.Directory (getTemporaryDirectory, removeFile)
import System.FilePath ((</>))
import System.IO

main :: IO ()
main = do
  dir <- getTemporaryDirectory          -- 1
  let file = dir </> "demo.txt"
  h <- openFile file ReadWriteMode      -- 2
  removeFile file
  start <- hGetPosn h                   -- 3
  hPutStrLn h "Hello!"
  hSetPosn start                        -- 4
  hGetContents h >>= putStrLn
1. Look up the system’s default temporary file location and construct a file path.
2. Create a file, and then immediately unlink it.
3. Mark the position at the start of the file, then write a message.
4. Reset the file handle to the start to read back the message and print it.
Thus we see that one can go on using a file even after it has been completely removed from the file system. So the real condition under which the operating system can garbage-collect a file is when it has no remaining hard links and no open file descriptors.
The first reason it matters to understand this is to appreciate the importance of not writing software with resource leaks. If you have a long-running process that forgets to close a Handle or two, you might think: How big a deal could that possibly be?
If you were expecting that file to get deleted, what it could cost is however much space that file takes up. As long as you have an open Handle, the operating system has to keep the entire content of that file until your process ends.
That is a good reason to use ResourceT in any situation where you’re dealing with a file. (This is the focus of chapter 1 of Sockets and Pipes.)
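The underlying idea is to tie the closing of a Handle to a scope so that it cannot be forgotten. As a dependency-free stand-in for ResourceT (which lives in the separate resourcet package), the sketch below uses bracket from base; it is not the ResourceT API itself. The file name and the helper writeScratch are made up for illustration.

```haskell
import Control.Exception (bracket)
import System.Directory (getTemporaryDirectory, removeFile)
import System.FilePath ((</>))
import System.IO (IOMode (WriteMode), hClose, hIsOpen, hPutStrLn, openFile)

-- | Write a line to a file inside 'bracket', then report whether the Handle
-- is still open once the bracket has finished.
-- (writeScratch is a hypothetical helper for this sketch.)
writeScratch :: FilePath -> IO Bool
writeScratch path = do
  -- hClose runs when the inner action finishes, even if it throws.
  h <- bracket (openFile path WriteMode) hClose $ \h -> do
    hPutStrLn h "scratch data"
    pure h
  stillOpen <- hIsOpen h
  removeFile path
  pure stillOpen

main :: IO ()
main = do
  dir <- getTemporaryDirectory
  writeScratch (dir </> "bracket-demo.txt") >>= print
```

Returning the Handle out of the bracket is done here only to show that it has been closed; in real code the Handle should not escape the scope that owns it.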
The second reason that anonymous files are interesting is that they’re not always accidents! Sometimes this is what you want. This is the case for the Wayland example described earlier. A Wayland client creates a file to store its graphics and then sends the file descriptor to the Wayland server. That file now functions as a shared memory space that the client writes to and the server reads from.
There is no reason for such a file to ever be hard linked!
A common pattern for Wayland applications is to do what I did in the silly example code earlier: create a file and then immediately delete it. But this is unnecessary, because we can do better.
I’ll talk about how in part two.