Blog was migrated from funwithbits.net.
From today on, I’ll try to blog about some interesting stuff once a month.
Blog was migrated from funwithbits.net.
From today on, I’ll try to blog about some interesting stuff once a month.
Ok, what I want here is to guide you through a Meltdown proof-of-concept that peeks into kernel data that cpu is supposed to protect. Unfortunately, that promise was broken and virtually all intel x86 cpu hosts are vulnerable until patched (Linux patchset is available and named kpti (kernel page table isolation)).
So far, we thought we were protected by cpu mechanisms because when you try to load kernel (sensitive) data, cpu triggers an exception which in turn is propagated by kernel to application as SIGSEGV, leading to application termination, look at this example:
1 2 3 4 5 6 7 8 9 10
So time went by, and we all thought we were safe. Recently it was figured out that speculative execution could be exploited to break that promise. It basically consists of making the cpu use the value stored in a kernel address before the cpu detects the access to kernel data is an invalid instruction, but it’s not that simple… look at this example:
1 2 3 4
It turns out that cpu may actually perform the addition of value found in kernel address and 10, but the result is definitely not returned to the program which again terminates with a SIGSEGV.
Now comes the BINGO, EUREKA! moment: A researcher figured out that whichever data (in example above kernel_data and byte) used in the instruction that procedes the invalid one (*kernel_address) will remain cached in the cpu data cache. Yes! CPU doesn’t invalidate data cached from an instruction which used the result (kernel data) of an invalid one (access to kernel data).
Well, data cache cannot directly be read. But we know that access to data cached is much faster than to uncached ones, so what we can do is to use the byte read from kernel as an index for an array of size 256 elements, each index representing a possible representation of a byte (0..255). For example:
1 2 3 4 5
That would only work if cache line size is 1, but it’s usually 64 for level 1, so we need to multiply array size and index by 64.
1 2 3 4 5 6
So if it happens that the instruction that accesses array using byte read from kernel is executed before cpu triggers exception for invalid access to kernel, it means cpu will bring array[kernel_data_as_index * cache_line_size] to cache.
Now we know that we can iterate through the 256 “cache lines” in array, and we’ll know that the one with the fastest access time is the one cached by cpu, and the “cache line” index is actually the byte read from kernel.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
It’s not hard to understand it. If you think of it, fastest_i is equal to the byte stored in kernel_address because program determined array[fastest_i * cache_line_size] is cached by cpu and only one place in array was cached previously by the instructions that accessed kernel and used its value to retrieve a byte from array.
In real life, it’s not that simple to implement because you need to clear data cache (clflush instruction can be used) before performing the test because you want to read more bytes and the data cache shouldn’t be polluted for array or wrong result is returned.
This is my assembly inline to clear cache line for a specific address:
1 2 3 4 5 6 7 8 9 10
So you’ll need to do that for each “cache line” in the array, the code now becomes this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
That’s not complete code because you’ll have to set up signal handler or use TSX (Transactional Synchronization Extensions) to mitigate SIGSEGV that occurs after the instruction to load kernel address retires (is fully executed). Follow example using TSX:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
There’s nothing much you can do with this code that only reads 1 byte from kernel space, so I’d suggest you to take a look at the code of my own project that exploits meltdown to check whether or not system is affected, follow link: https://github.com/raphaelsc/Am-I-affected-by-Meltdown I’d also recommend you to take a look at proof-of-concept of the researchers involved: https://github.com/IAIK/meltdown/
I’m currently studying OS development with MIT course 6.828 mostly for fun. I highly recommend it for everyone wanting to have a better understanding of how computer works. It will teach you how an operating system works like xv6 and you will also build your own OS. Check it out here: https://pdos.csail.mit.edu/6.828/2016/schedule.html
After you complete the first lab, you will better understand how hardware initialization is done by firmware BIOS, then how bootloader is loaded, and in turn the operating system.
Earlier IBM PCs couldn’t address more than 1MB of memory, and only the first 640k chunk was actually RAM. The 640k-1MB range was reserved for special use, like mapping devices like VGA display, 16-bit PCI devices, and BIOS ROM.
The layout is more or less as follow:
1 2 3 4 5 6 7 8 9 10 11 12
It gets more complex for 32-bit physical address space. It’s also amazing how the hardware engineers left holes in the physical address space of the machine for backward compatibility.
When computer is turned on, BIOS is mapped at that 64k region reserved to it and control is transferred over to it. I want to better understand how QEMU emulates that. GDB and strace will be my friends in this journey. The former will be used to step through BIOS instructions, and the latter to see what QEMU is doing internally, like memory maps and files opened.
one of the most interesting things in strace output is the following:
1 2 3 4 5 6
It’s very interesting that file storing BIOS is 256k in size, even though area reserved for it is 64k. I suppose that only some part of it is actually mapped. There are also some interesting files such as /usr/share/qemu/kvmvapic.bin and /usr/share/qemu/vgabios-stdvga.bin which are opened and read. They all will be used to emulate the layout figure shown above.
QEMU also reports the following through ‘info roms’:
1 2 3 4 5
GDB can remotely connect to QEMU instance which waits for it, and that’s what I do here. Let’s get started…
QEMU starts executing at physical address 0x000FF000 and in real mode, which is at the very top of the area reserved for BIOS. CS and IP registers are 0xF000 and 0xFFF0, respectively. CS and IP were both used for addressing memory because registers were only 16 bits, thus the segmented addressing mode which multiples segment by 16 and adds offset. So PC is able to address up to 1MB of memory. The problem is that different values for segment and offset may refer to same physical memory which may easily lead to bugs. That was later addressed in protected mode which had bigger registers and MMU at its disposal.
We have a very basic idea of what BIOS will accomplish which is memory check, device initialization, and lately find a bootable sector to load into a predefined address (0x7C00) for which it will transfer control to. The purpose of this article is to go into the very specific details, because we already know the higher-level overview.
I’ll only show the most important instructions because this article would get long and boring otherwise. On top of each instruction, there will be a comment to explain it and also possibly share something interesting.
http://web.archive.org/web/20040304063834/http://members.iweb.net.au:80/~pstorr/pcbook/book2/ioassign.htm is used as a reference for I/O addresses. I’m happy wayback machine allows me to access that great website again.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
I’m really tired now, and the article will also be boring if I keep going. I think it’s better to stop at this point. As we know, the BIOS will also keep initializing things and afterwards will load boot sector into 0x7c00 and start executing it. I hope you had lots of fun reading this article. My main idea is to have people interested in operating system engineering. I’ll also try to keep posting as I go through the OS course offered by MIT.
Bye for now!
There’s some magic about classical music that helps me as a programmer. Well, I’m definitely not the best programmer out there, so when I find it hard to come up with a creative solution, I listen to classical music over and over again. I leave my gratitude here for Chopin, Mozart, Dvorak, Paganini, Vivaldi, Beethoven and so many others composers. I’m so grateful for all those pieces you guys have left behind. When I’m going through a creativity crisis, I listen to their songs. I also like to listen to them while working. It can be said that we all need to be in a good mood to produce quality work. That’s another thing that classical music plays an important role in my life. Whatever my mood is, I listen to classical music because it impacts my mind in a good manner. It’s almost like an infinite potion of inspiration that you can drink from.
People have different tastes, so I was thinking if your favorite genre can have the same effect on your life. Classical rock, instrumental rock, etc, could work too. But I don’t think shitty, noisy pop music will do it. I actually don’t think I can find the same level of magic in Beethoven symphonies, Vivaldi’s four seasons, Paganini’s caprices, etc, in some other genre. The complexity of the melody, the depth, the harmony in those musics are outstanding. I’m having a really hard time here to describe it, i.e. how classical music impacts almost every single aspect of my life. It helps me boosting my creativity and also keeping focused on the most important things, it makes me happy when I’m sad, and also the other way around. Yes, the latter is true, have you ever tried listening to Chopin and Beethoven’s sonatas?
If you aren’t into classical music, I leave here my invitation for you to try it out. I will help you get started by leaving some links below to some musics from the composers I like the most. These are just the tip of the iceberg though.
I really hope you guys will give classical music a chance. I didn’t listen to it when I was a teenager. Now I love it, it’s my favorite genre by far. Anyone out there that share the same feelings? Not necessarily due to classical music, but whichever genre you like the most.
What if I tell you that there’s a framework out there that allows you to write servers that are as fast as the fastest man alive? Let me introduce you to Seastar. It’s a framework written in modern C++ that provides its user with a set of tools to extract the most from modern hardware.
Look at the chart below which compares Seastar memcached vs stock memcached:
And it’s not only throughput. It has also proven to improve P99 latency considerably. Other example of applications that benefit from Seastar are ScyllaDB, a distributed database compatible with Apache Cassandra, and Pedis, a drop-in replacement for Redis.
Let’s go through some of its features and I expect that you will understand its power by the end of the list. Here we go:
1. One of its most interesting features is the shared-nothing architecture. What exactly is that? Basically, there will be only one thread for each core (or shard in Seastar terminology) available to the application. And it also means that each thread will have its own set of resources (files, memory, etc) that will not be shared. By doing that, we eliminate the need to use locks because resources are no longer shared by threads. Communication between threads must be done explicitly through message passing, or else, we would need some locking mechanism.
: In seastar, a shard is a thread assigned to a CPU core that acts like an isolated machine.
A common multi-threaded application usually looks like the left picture because you have threads accessing shared resources, whereas a Seastar application will look like the right picture because of its shared-nothing design, look:
2. Each Seastar thread will have its own scheduler for small asynchronous tasks (usually stored in std::function), which is called The Reactor. It’s important to keep in mind that all tasks should be asynchronous. That’s because if a thread blocks (waiting for I/O, for example), the reactor will not be able to run other tasks waiting to run.
Let me throw at you another example. What happens if a syscall is called which involves blocking the calling thread until the requested resource (for example, sys_read) is satisfied? The CPU would sit idle while waiting for I/O. From the perspective of a server, it would mean not handling any requests. So when you’re writing code for Seastar, you must make sure that you only use asynchronous API either provided by Linux or Seastar. All I/O in Seastar is done through asynchronous mechanisms provided by Linux such as aio.
Seastar API is very well documented, and it can be found here: http://docs.seastar-project.org/master/index.html
3. Because of the issue mentioned above, Seastar needs some mechanism to make it easier the task of writing fully-asynchronous code. And that’s done using the concept of futures and promises.
Let me explain how future is used in Seastar with code. Let’s say that you want to call a function to sleep for 1 second. Usually, you would only call a sleep function with 1 as parameter, and then the calling thread will sleep for 1 second and continue from when it left off. In Seastar, you shouldn’t block the current thread due to performance reasons, but you can block the task (also known as fiber). So you’d need to call a Seastar function which promises to wake up the task after 1 second.
1 2 3 4 5 6 7 8 9 10 11 12 13 14
All functions that need to wait for something (like data from disk or network, time, etc) return a future type. Those futures can be consumed using future::then(), which takes a function that will run when the time has come. In the example above, sleep() returns a future type, which will be waited using future::then(). So the program above should print ‘Done.’ after 1 second. The chain of functions (also known as continuations) can grow as you want. So after the print, the program could sleep again for N seconds, write to a file, send a command over the network, whatever you want, all asynchronously! Please take a look here to know more about futures and promises in Seastar.
The list is getting long, so I’ll tell you quickly what else Seastar provides:
First of all, you should fork the project, which can be found here. Take a look at the README file for instructions on how to install deps and compile the project.
If you’re interested in knowing more about Seastar, I’d advise you to read this tutorial written by Nadav Har'El and Avi Kivity. If you’re willing to delve into some Seastar apps, please go to: https://github.com/scylladb/seastar/tree/master/apps
If you need help, send an e-mail to the project’s mailing list: firstname.lastname@example.org
Seastar is very complex and it’s very hard to cover many of its aspects in a single article. At least, I hope I succeeded to help people understand what Seastar is about and how to get started. I intend to write more articles about it in the future covering more real-world examples. For example, how to write a simple server.
Thank you for your time and stay tuned!
Prerequisite: basic C++ and game programming knowledge
When you start any game, you expect to see a loading screen, followed by the main menu which has a button that allows you to play the game. When you start playing the game, it’s also expected that you’ll be able to go back to main menu and possibly pause and resume the game. All these different stages of the game are known as game states.
Handling game states is a very difficult task, especially to newbies to game programming like myself. Today, I was looking for an efficient way to switch back and forth between all available states of my simple game.
The simplest way to do it would be using a switch statement, as follow:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
That’s indeed very simple, but it would be a nightmare to maintain this code when the number of states increases considerably. It turned out that a Finite State Machine (FSM) is exactly what I was looking for. It’s said that FSM is an abstract machine that can be in exactly one of a finite number of states at any given time.
To implement a state machine that handle different types of game states, I took advantage of polymorphism. Each game state will derive from an abstract class called game_state; follow its definition:
1 2 3 4 5 6
The first two methods will be used for loading and cleaning the game state, respectively. game_state::update() will be used for a given state to react to user’s input and possibly switch to another state. For example, when an user clicks on play button, the state machine will switch from menu to play state.
Now our state machine will be able to work with all different types of game states in the same way. To make the transition between stages more efficient, the machine will work in the same way a stack does. For example, a new state will be pushed to the back of container storing the game states. And more importantly, the active state is the one that was last pushed to the container. That’s how my implementation of game state machine turned out:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
Note that game_state_machine::update() will only call update on behalf of the active state, and that’s essential for the machine to work as expected.
I showed the implementation of abstract class for game state, but it’s also important to understand how an actual game state could be implemented by deriving from it. Check it out:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Very easy, no?! If we’re in play state, and want to go back to menu, all we need to do is to call game_state_machine::pop().
This was the most efficient way I found to handle game states in my own game. If you know a better way to do it, please leave a comment!
PS: the comment section only shows up when you click on the blog post’s title.
Hi everyone. I’m really glad that I finally launched this blog. Fun With Bits.
Special thanks goes to Peteris Krumins who shared some invaluable insights about blogging.
TL;DR: Bla bla bla about who I am. In this blog, I’ll talk about software development techniques, low-level hacks, game development, kernel, among other interesting topics I feel it’s worth sharing with my readers.
I don’t expect people to know me prior to reading my blog, so I’m going to introduce myself.
I think that my father (unfortunately, he’s no longer around us) is the main reason behind my interest in computers. When I was about 10 years old, he taught me how to use commands to interact with MS DOS. I remember the most important commands were to edit texts, execute programs, and navigate through the file system. I don’t know how many times I messed up with the operating system, making impossible to boot it. I had a lot of fun doing it though. After that, I was able to create my own websites using HTML and CSS. I didn’t even know what programming actually was at that time. I was just curious and wanted to do interesting things with my computer.
I actually started with programming after I got interested in creating my own alternative server for a MMORPG called Tibia. Of course, I didn’t create the server, but the experience made me learn how to host a server, manage my website and consequently a database which stored players' data, and also how to write scripts using LUA programming language for custom systems, like quests and spells.
At that time, I wasn’t even thinking of computer science as a career. I was just having fun with bits. Pun intended :-) But I changed my mind and joined a local university for a computer science degree. It helped me a lot because I wanted to be the best programmer among my peers. I worked really hard. A friend of mine a.k.a. kov introduced me to the open source world. He told me about the job opportunities I could get if I started helping relevant open source projects out there. IIRC, Linux kernel was the first open source project I contributed to. The change was to slightly improve the PID allocation code. I will not lie. It was extremely hard to find something to do, but the experience taught me a lot. I was really obsessed with kernel. I often found myself whispering the word ‘kernel’ while showing or walking down the streets. I wanted to better understand how computers worked, from boot to application initialization.
As I was gaining more experience, I contributed to other projects until I got my first job opportunity at an israeli startup called Cloudius Systems. It was working on creation of a project called OSv. In OSv, I worked with file systems. Mostly on improving ZFS support. I’m working for the same company, but now on creation of a distributed database called ScyllaDB.
I think I’m considerably better than myself of 5 years ago, but there’s a long way to go until I can say I’m really good with computers. By the time being, I’ll keep repeating this ZEN poem to myself: “To follow the path, look to the master, follow the master, walk with the master, see through the master, become the master.”
I’ll do my best to share interesting stuff with all of you.