Skip to content

Ray is coding Posts

Writing a NES emulator in Javascript: Part 2 (starting on the CPU)

Let’s start with the most obvious choice: The CPU. The NES’s 2A03 CPU is a modified 6502 processor. There’s an interesting story behind the choice of processor:

“The Nintendo core processor was a 6502 designed with the patented technology scraped off, We actually skimmed off the top of the chip inside of it to see what it was, and it was exactly a 6502. We looked at where we had the patents and they had gone in and deleted the circuitry where our patents were.”

Although there were changes, the NES microprocessor ran 99% of the 6502 instruction set. “Some things didn’t work quite right or took extra cycles,”

One important difference is the removal of the decimal mode, which was patented technology. The flag still exists, but is permanently disabled.

The focus of this implementation was to get code working, and so no effort is being made to optimise anything. For instance, some of the bitwise function are implemented using string operations and will be switched over later.

In order to test against a reference, I’m using nestest.nes which the docs describe as follows:

This here is a pretty much all inclusive test suite for a NES CPU. It was designed to test almost every combination of flags, instructions, and registers. Some of these tests are very difficult, and so far, Nesten and Nesticle failed it. Nintendulator passes, as does a real NES (naturally). I haven’t tested it with any more emulators yet.

I attempted to check the states of all flags after most instructions. For example, CPY and CMP shouldn’t affect the overflow flag, while SBC and ADC should. Likewise, all forms of wrapping ARE tested for- zeropage wrapping being the tests most emulators fail.

Starting the code

The log output for nestest.nes from Nintendulator is used as a reference. The CPU is represented by a class that contains the processor logic and attributes for the various parts of the CPU:

 

class CPU {

  constructor() {
    this.memory = new Memory;
    this.registers = {PC: 0, SP: 0, P:0, A:0, X: 0, Y: 0};
    this.flags = {carry: false, zero: false,
      interrupt_disable: true, decimal_mode: false,
      break_command: true, overflow: false, negative: false};
    this.running = false;
    this.cycles = 0;
    this.logger = new Logger;
    this.stack = new Array;
    this.log("CPU Initialized");
  }

Memory access is split out into its own class which contains functionality for getting and setting memory. The NES includes memory mapped I/O Implementing memory mapping in the CPU involves mapping particular addresses to different hardware elements. The fetch() function of the memory class accomplishes the memory mapping for retrieval detailed here:

 

  fetch(addr) {
    if(addr >= 0 && addr <= 0x7FF) {
      // 0000-00FF Zero-paged region
      // 0100-01FF - Stack Memory
      return this.ram[addr];
    } else if (addr >= 0x800 && addr < 0x0FFF) {
      // mirrors RAM
      return this.ram[addr-0x800];
    } else if (addr >= 0x1000 && addr < 0x17FF) {
      // mirrors RAM
      return this.ram[addr-0x1000];
    } else if (addr >= 0x1800 && addr < 0x1FFF) {
      // mirrors RAM
      return this.ram[addr-0x1800];
    } else if (addr >= 0x2000 && addr < 0x2007) {
      // NES PPU registers
    } else if (addr >= 0x2008 && addr < 0x3FFF) {
      // Mirrors of $2000-2007
    } else if (addr >= 0x4000 && addr < 0x4017) {
      // NES APU and I/O registers
    } else if (addr >= 0x4018 && addr < 0x401F) {
      // APU and I/O functionality
    } else if (addr > 0x4020 && addr < 0xFFFF) {
      if (addr >= 0xC000 && addr < 0xFFFF) {
        return this.rom[(addr-0xC000)];
      }
    }

  }

We'll get to how the ROM part works in the next article.

I opted to implement the opcodes as a giant switch() statement, which while not optimal will get the job done. An alternative approach I could've used was the one Imran Nezar used of creating an object containing all the operations as values, with the opcodes as the keys.:

  execute() {

    this.log("Beginning execution at " + this.registers.PC);
    this.running = true;

    while(this.running == true) {

      let opcode = this.memory.fetch(this.registers.PC);
      switch(opcode) {

      case ops.NOOP:
        let nb1 = this.memory.fetch(this.registers.PC+1);
        let nb2 = this.memory.fetch(this.registers.PC+2);
        let j = utility.Utility.merge_bytes(nb1, nb2);
        this.log("JSR " + j, this.registers.PC);
        let bytes = utility.Utility.split_byte(this.registers.PC+3);
        this.stack.push(bytes[0]);
        this.stack.push(bytes[1]);
        this.registers.SP -= 2;
        this.registers.PC = j;
      break;
      case ops.TSX:
          [...]
      break;
      case ops.TXS:
          [...]
      break;
      case ops.TYA:
          [...]
      break;

In the next article I'll talk a little about how to load a ROM file and how to get the code booting.

Writing a NES emulator in JavaScript: Part 1

Warning: these may be some long articles

Introduction & Inspiration

I’ve always wanted to write an emulator, I’ve found the concept fascinating, and I thought it was finally time to complete a emulation project. I went with the NES for the console, and JavaScript as the language. The reason being the NES is well documented, there are functional emulators out there with the ability to provide debugging output, and I owned a NES myself.

 

About Emulators

If you’re unfamiliar with how emulators work, aside from that they let you play videogames on your computer, the basic explanation is that the software you’re using is a code version of the original concept. Instead of the NES hardware reading and acting on the instructions contained in the ROM code on the cartridge, a program handles the CPU instructions, creates its own RAM, and maps graphical output instructions to visual output on your screen. This is incredibly complex, and hard to get the hang of, but it’s also a lot of fun, and provides some interesting problems to solve.

The height of technological progress

For instance, the NES hardware is a little idiosyncratic, and often restrictive in what it allowed programmers to be able to do. As a result, NES game programmers learned to exploit aspects of the hardware in very interesting ways, relying on the exact hardware implementation’s quirks to produce the result. This could mean that a “perfect” emulator that runs each component exactly may not run games properly.

The Hardware

Let’s take a look at the NES hardware. The NESdev wiki has been an amazing resource so far, and it’s a great place to start from. In these really early stages, gathering an idea of the hardware to be implemented and a rough sketch of how things fit together would be useful.

The goal of the first phase was to get a rough outline of the code and get a few instructions working. In order to do this a few things need completing:

  1. Basic CPU emulation What flags and registers does the CPU use? How does it calculate clock cycles for synchronization?
  2. Memory Mapping How do different parts of hardware map to addresses in memory? (this is the primary way that the CPU communicates with other hardware parts)
  3. Basic ROM Loading. Where is the executable data stored in the data from the cartridge?
  4. Ability to debug instructions Let’s be real, I have no clue what I’m doing with a lot of this, I’m going to need something to check against when I get stuck.

The CPU

The NES CPU core is based on the 6502 processor and runs at approximately 1.79 MHz. Reading this PDF helped immensely. The 6502 has the following registers:

  • A Accumulator Register
  • X, Y Index Registers
  • SP Stack Pointer
  • P Program Counter
  • S Status Flag

At its most simplest, a CPU is only really doing a few simple things. We fetch a byte from memory, decode the opcode, and take some action based on it. And then we do it again. And again. Until we have a reason not to. The CPU is where I’m going to start, and in the next post I’ll talk a bit about how that’s implemented.

Conway’s Game of Life (JavaScript)

The Game of Life, also known simply as Life, is a cellular automaton devised by the British mathematician John Horton Conway in 1970.

.

Video recording of simulation, live demo can be found here.

Conway’s Game of Life is awesome for being the simplest example of artifical life. It implements a rule-based system for determining whether elements in a grid should be alive or dead, with a very simple rule system. But like with DNA or Ant Colonies (read about it!) a simple set of rules can result in an amazing array of diverse behaviour.

Conway’s Game of Life isn’t hard to implement, and seems like a great target for optimisation. The logic is pretty simple, we need a grid of cells. A lot of people
use two-dimensional arrays for this, but I opted for a 1D array where the rows/columns are calculated through a translation function. Each cell contains a value of 0 or 1 for dead or alive.

The Code

class Grid {
  constructor() {
    this.grid = new Uint8Array(GRID_SIZE);
    this.next = new Uint8Array(GRID_SIZE);
    this.init();
  }

  dump() {
    return this.next;
  }
  swap() {
    this.grid = this.next;
    this.next = new Uint8Array(GRID_SIZE);
  }
  at(x, y) {
    return y * GRID_WIDTH + x;
  }
  init() {
    this.grid = fast_random(GRID_SIZE);
  }

The first part of the Grid class, the main meat of the code, is pretty simple. The Grid stores a current grid and a next grid, which is updated and then swapped with the current one during animation cycles. When the Grid is initialized it grabs GRID_SIZE bits using the fast_random function (more on this later).

Survival Function

For each cell within the “grid”, we need to calculate the number of neighbours to figure out its fate in the next generation. The rules for the game are as follows:

Any live cell with fewer than two live neighbours dies, as if caused by underpopulation.
Any live cell with two or three live neighbours lives on to the next generation.
Any live cell with more than three live neighbours dies, as if by overpopulation.
Any dead cell with exactly three live neighbours becomes a live cell, as if by reproduction.

 

  update() {
    for (let x = 0; x < GRID_WIDTH; x++) {
      for (let y = 0; y < GRID_WIDTH; y++) {
        let score = 0;
        const CURRENT_LOCATION = this.at(x, y);
        for (let i = -1; i < 2; i++) {
          for (let j = -1; j < 2; j++) {
            if (i == 0 && j == 0) {
              // skip self
            } else {
              score += this.grid[this.at(x + i, y + j) % GRID_SIZE];
            }
          }
        }

Here we’re just looping over every cell of the grid, and for each of the cells adding up the 1s in the surrounding 8 cells. From this we can figure out whether it’s going to be alive or dead based on its current state and its neighbour count:

        switch (this.grid[CURRENT_LOCATION]) {
          case 0:
            if (score == 3) {
              this.next[CURRENT_LOCATION] = 1;
            } else {
              this.next[CURRENT_LOCATION] = 0;
            }
            break;
          case 1:
            if (score > 3 || score < 2) {
              this.next[CURRENT_LOCATION] = 0;
            } else {
              this.next[CURRENT_LOCATION] = 1;
            }
            break;
        }
      }

Pretty simple!

Rendering the Output

I wanted to write several potential renderers to test what gave the best performance. Each simply takes a grid as an array and draws it.

const renderer = RectangleRenderer;
render = new renderer();
let grid = new Grid();

function iterate() {
  grid.update();
  render.render(grid.dump());
  grid.swap();
  requestAnimationFrame(iterate);
}

var ctx;
$(document).ready(function() {
ctx = $("#game-view")[0].getContext("2d");
iterate();
});

Initial Optimisations

Ok so, one problem. With a 256×256 grid the initial start-up takes up to 9 seconds. Some quick JS Profiling in chrome seemed to indicate that the culprit was the init() function.

The init() function was seeding the grid by iterating over the entire grid’s X/Y coordinates and filling each cell with a random 0 or 1. But in order to do this, the Math.random() function was being called for each cell (meaning it needs to be called 65535 times) as the following:

  cell = Math.floor(Math.random()*2)

As it turns out this is extremely wasteful. In theory, each call to Math.random() could be used to supply up to 32 unique bits per call, which would reduce the number of calls significantly. After implementation, the init() function began to run almost instantaneously:

function random_byte() {
  let el = Math.round(Math.random() * (Number.MAX_SAFE_INTEGER/2) + Number.MAX_SAFE_INTEGER/2);
  let bits = [];
  static_counter += 1;

  while (el > 1) {
    let bit = Math.floor(el % 2);
    bits.push(bit);
    el = el / 2;
  }

  return bits;
}

function fast_random(size) {
  let array = [];

  while (array.length < size) {
    let c = random_byte();
    array = array.concat(c);
  }

  return array.slice(0,size);
}

With this function added, the initial delay disappeared entirely for a 256x256 grid. In the next article I'm going to look at some of the algorithms used in speeding up the Game of Life as an implementation level.

Ruby’s Array#map_with_index and each_with_index in Javascript

#each_with_index and #map_with_index are commonly used Ruby features that can be easily achieved in JavaScript

Each with index

In Ruby we can do this:

x = ["a","b","c","d","e"]

x.each_with_index { |el,n| puts "#{n} is #{el}" }

Output:

0 is a
1 is b
2 is c
3 is d
4 is e

In Javascript it’s possible to do something very similar. The documentation for JavaScript’s Array.forEach function states that the three arguments passed to the callback function are:

callback is invoked with three arguments:

the element value
the element index
the array being traversed

Therefore we can do this:

let array = ["a","b","c","d","e","f"];

array.forEach((x,n) => (console.log(x + " is " + n)));
a is 0
b is 1
c is 2
d is 3
e is 4
f is 5

Map with index

In Ruby it’s possible to do this:

x = ["a","b","c","d","e","f"]

z = x.map.with_index { |el,n| (el.ord + n).chr }

puts z

Output:

a
c
e
g
i
k

Similar to with forEach, Map has the following arguments passed to its function argument:

currentValue

The current element being processed in the array.

indexOptional

The index of the current element being processed in the array.

arrayOptional

The array map was called upon.

Which makes it possible to do this:

let array = ["a","b","c","d","e","f"];

let new_array = array.map((x,n) => String.fromCharCode(x.charCodeAt(0)+n));

console.log(new_array);

Output

[ 'a', 'c', 'e', 'g', 'i', 'k' ]

Pwnable.kr CTF writeup: level 6 – random

This one turned out to be surprisingly easy. What’s required is an understanding of random number generators, and how they’re not really random. Implementations of random number generators are known as PRNGs, acknowledging that they are pseudo random number generators. They use various tricks in order to produce streams of not obviously connected numbers, but without an understanding of how they’re meant to work it can be easy to produce a problem similar to the one described in this CTF.

Code

#include

int main(){
unsigned int random;
random = rand(); // random value!

unsigned int key=0;
scanf("%d", &key);

if( (key ^ random) == 0xdeadbeef ){
printf("Good!\n");
system("/bin/cat flag");
return 0;
}

printf("Wrong, maybe you should try 2^32 cases.\n");
return 0;
}

Pretty simple to understand, we have two variables, random and key. A decimal value is read into key through scanf, and rand() assigns a “random” number to the variable ‘random’. We then XOR the values and if the result is 0xdeadbeef we give up the flag. In theory, this should be hard to beat. Wouldn’t we have to keep trying values until we happen to get the right one? And with such a large problem space, that could take a very long time.

Key Insight

The most important thing to recognise is that this is incorrect use of the rand() function. Using PRNG correctly involves “seeding” the generator with a value that changes frequently, a common tactic is to use the current timestamp supplied to srand (there are some issues with this too, see if you can figure them out). This leads to predictable behaviour, which becomes a problem when knowing the random value allows us to exploit the behaviour. A list of historical examples of where this has been used in real-world attacks can be found on wikipedia’s page on random number generator attacks

Since the rand() call isn’t seeded, it’s fairly trivial to check the value of a few runs of the program in GDB. Setting a breakpoint for the next line after the call to rand() makes it easy to observe what’s returned from the function into the RAX register (the one used to store return values traditionally). A few runs through and it’s easy to see that this value doesn’t change. Due to failure to seed the PRNG it’s returning the exact same value each time, which happens to be 0x6b8b4567. Now we have the following simple equation to solve: something XOR 0x6b8b4567 = 0xdeadbeef. The value we’re looking for will be 0xdeadbeef XOR 0x6b8b4567, which ends up being = 0xb526fb88 which is 3039230856. Let’s try it:

random@ubuntu:~$ ./random 
3039230856
Good!
Mommy, I thought libc random is unpredictable...

Pwnable.kr CTF walkthrough: level 1 fd

Pwnable.kr contains a set of CTF challenges ranging from beginner to advanced. I’ve been working through them to increase my understanding of binary exploitation and Reverse Engineering, and I’ll be posting walkthroughs for ones I’ve solved

Logging in to the machine specified, we see a banner and have a bash shell, and listing the contents of the homedir we can see we have three files, a binary, a C source file (of the binary) and the flag file.

$ ssh fd@pwnable.kr -p2222
fd@pwnable.kr's password: 
 ____  __    __  ____    ____  ____   _        ___      __  _  ____  
|    \|  |__|  ||    \  /    ||    \ | |      /  _]    |  |/ ]|    \ 
|  o  )  |  |  ||  _  ||  o  ||  o  )| |     /  [_     |  ' / |  D  )
|   _/|  |  |  ||  |  ||     ||     || |___ |    _]    |    \ |    / 
|  |  |  `  '  ||  |  ||  _  ||  O  ||     ||   [_  __ |     \|    \ 
|  |   \      / |  |  ||  |  ||     ||     ||     ||  ||  .  ||  .  \
|__|    \_/\_/  |__|__||__|__||_____||_____||_____||__||__|\_||__|\_|
                                                                     
- Site admin : daehee87.kr@gmail.com
Last login: Sat Feb 17 06:54:22 2018 from 121.125.207.90
fd@ubuntu:~$ ls
fd  fd.c  flag
fd@ubuntu:~$ 

Checking the ownership of the files yields the following:

fd@ubuntu:~$ ls -l
total 16
-r-sr-x--- 1 fd_pwn fd   7322 Jun 11  2014 fd
-rw-r--r-- 1 root   root  418 Jun 11  2014 fd.c
-r--r----- 1 fd_pwn root   50 Jun 11  2014 flag

So we know that the ‘fd’ executable has the SUID bit set, and will run as fd_pwn. We can’t access the flag directly, as it’s owned by fd_pwn, but presumably we can subvert the execution of the fd binary and access the flag file as the user that owns it. A quick look at the source of fd yields a contrived exploitable example, but one that still requires a couple of insights to solve:

#include 
#include 
#include 
char buf[32];
int main(int argc, char* argv[], char* envp[]){
        if(argc<2){
                printf("pass argv[1] a number\n");
                return 0;
        }
        int fd = atoi( argv[1] ) - 0x1234;
        int len = 0;
        len = read(fd, buf, 32);
        if(!strcmp("LETMEWIN\n", buf)){
                printf("good job :)\n");
                system("/bin/cat flag");
                exit(0);
        }
        printf("learn about Linux file IO\n");
        return 0;

}

The important call here is to read(), where data is read into a buffer from a file descriptor obtained by converting the user supplied string to integer and subtracting 0x1234 (4660 in decimal). If you're unfamiliar with linux's file descriptors read this first. The key insight is that processes in linux have numerical file descriptors assigned to them, by default each has 3 (0, 1 and 2) corresponding to STDIN, STDOUT and STDERR. In theory, if we can make the read() call obtain its data from the file descriptor 0, it can be tricked into obtaining data from standard input and placing it in the character buffer 'buf'. We can then simply supply the string LETMEWIN on the command line and obtain the flag. Supplying 4660 as the user input results in the fd being 0, and prompts for a string. And we're done!

fd@ubuntu:~$ ./fd 4660
LETMEWIN
good job :)
mommy! I think I know what a file descriptor is!!

Genetic Algorithms: Evolving “Hello World” in Javascript

I’ve always been extremely interested in evolution as an optimisation algorithm, this post is part of a series of write-ups on experiments using genetic techniques.

If you’re just interested in seeing the end result, check the demo here

Evolution is often described as being a “random” process. Some lay understandings of evolution imply that its process is not feasible given the number of “random” mutations required in order to produce a particular change or result in an organism. The key point neglected in these evaluations is that while mutations are a random component of evolution, the main power is its ability to accumulate “useful” mutations over time. The algorithm presented here is a simple illustration of the power of these kinds of algorithms, future articles with examine more advanced usage of the capabilities.

The problem

Generating the string “Hello, World!” (or any other) from random selections of letters. We start with an initial population of randomly generated strings and allow survivors whose string distance score is the smallest.

The problem space

Given the length of the string N, we will have to search through 26^n possibilities, leading to very rapid combinatorial explosion. For the “Hello, World!” string we will need to search through 2.4811529e+18 (that is, 2.48*10^18) different strings to find the one we’re looking for.

Strings are generated as candidates using the following function, resulting in each initial generation consisting of a series of randomly generated sequences of letters. Note that the alphabet doesn’t contain punctuation, even though it is used in the target string. This is because mutations will result in ASCII characters outside the specified range.

function random_string(len) {

  return new Array(len)
             .fill(0)
             .map((_) => ALPHABET.charAt(get_random_int(0,ALPHABET.length)))
             .join("")
  }

Elements of Genetic Algorithms

Since this example is very simple, it becomes easy to create a set of tunable parameters to investigate the effects on the resulting evolutionary process. These parameters are specified at the top of the JS file:

// string we're targetting
const TARGET_STR = "this string was produced by evolution";
// maximum length of initially created strings
const RANDOM_STR_MAX_LEN = 50;
// mutation rate. 40 means that on average each gene in 40 will be mutated 
const MUTATION_RATE = 40; 
// the tokens to draw from when creating initial strings
const ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" 
// the amount to increase or decrease each ASCII amount by during mutation
const MUTATION_STEPS = 2 
// penalty for being a different length from target string
const PENALTY_FACTOR = 100 
// initial generation size
const INITIAL_GENERATION_SIZE = 100;  
// number of candidates allowed to breed
const ALLOWED_TO_BREED = 40; 
// number of candidates allowed to survive
const ALLOWED_SURVIVORS = 40; 

Iteration

Each new generation is produced partially from the prior ones. Some of the best candidates will survive to the next round, and will also “breed” with other candidates to produce new candidates.

Fitness

After each round “fitness” should be evaluated. This is the digital equivalent of the survival/replication aspect of real evolution. A candidate’s fitness score is used in order to judge whether or not it will survive to the next generation. The algorithm described here uses a numerical distance factor created by summing the integer distance of the character codes of the candidate vs. the target string.

function fitness(str) {

  let smallest = Math.min.apply(null,[str.length, TARGET_STR.length]);
  let score = 0;
  score += Math.abs(str.length - TARGET_STR.length) * 100;

  for(let x=0;x<smallest;x++) {
    score += Math.abs(str.charCodeAt(x) - TARGET_STR.charCodeAt(x));
  }

  return score;

}

Mutation

Mutation introduces the ability to inroduce new genetic material in each round. Interesting results occur if the mutation function is removed or altered, and can result in a single dominant solution very early on with little or no new genetic material. For each child of the candidates in each generation, for every codon there is an 1 in MUTATION_RATE chance of a mutation occuring, which shifts a single letter’s character code by a random amount. If MUTATION_STEPS is 2, the random amount to shift will be between -1 and 1. If it’s 10, the random amount will be between -5 and 5.

function mutate(str) {

  return str
         .split("")
         .map((letter) => 
           (Math.round(Math.random()*MUTATION_RATE) == 2) 
           ? 
             String.fromCharCode(letter.charCodeAt(0)+Math.round(((Math.random()*MUTATION_STEPS)-(MUTATION_STEPS/2)))) 
             : 
             letter)
         .join("");

}

Breeding/Recombination

In order to retain the benefits of the optimal candidates, we cross-breed the best solutions. There are numerous ways to do this. One might be to simply take the odd numbered characters from one string, the even numbered characters from another string, and splice them together. Another could be to just use the first half of one string and the second half of the other. A solution often used in genetic algorithm research is creating “cut points” in the genomes of the two candidates. For instance, imagine two strings:

AAAAAAA
BBBBB
We select cut points 3 and 4 in string1, and 2 and 4 in string2, the resulting child ends up being:
  str1 0-3 AAA
  str2 2-4 BB
  str1 4-5 A
  = AAABBA

These functions are implemented as follows:

function breed(str1, str2) {

  let cp_1 = get_random_int(0, str1.length-1); 
  let cp_2 = get_random_int(cp_1, str1.length-1);
 
  // cp_1 is in the first string, defined as the first cut-point
  // this means we will take genetic material from str1 from 0 -> cp1
  // this will then be combined with the genetic material between cp_3 and cp_4 in str2
  // so we have str1[0,cp_1] + str2[cp_3,cp_4] + str1[cp_2,str1.size]

  let cp_3 = get_random_int(0, str2.length-1);
  let cp_4 = get_random_int(cp_3, str2.length-1);

  return (str1.slice(0, cp_1) + str2.slice(cp_3, cp_4) + str1.slice(cp_2, str2.length)).toString();

}

Put it all together and run it in the console and you end up with a generational history something like this:

 [lWTogtaMKlaXTHFozonCcqkpIrpESsEIqSSl, 916]
 [hduqCqurauqCqtrEsvrhprahDawErvrpvrpen, 349]
 [ieprBstrhtrDrtrEsvqbtaagDbwEavrputagn, 295]
 [hepsDrurirqCrdsCprrctaafBbxDavpptughn, 256]
 [iegsCquritgCrctBprpct#beBbxDcuonuthgn, 232]
 [uggtAqtrithBvbtBprpbtcbdBbxDcuonuthgn, 210]
 [uggs#rtrjnhAwasBprqctcid#ayCcvolurihn, 193]
 [ughs#rtrinh#xatAprpducdd#ayBdvnlutign, 180]
 [thhs#strinh#wat#prpcucdd#bxBdvnlutiin, 169]
 [this#strinh#wat#qroduced#byBdvnlutiin, 160]
 [uhis#strinh#wat#pqoduced#by#cvolutiin, 152]
 [this#string#xas8pqpduced#by#evoluthjn, 142]
 [this#string9was6qqoduced9by#evoluuijn, 134]
 [thhs9string7was8produced6by#evolutikn, 125]
 [this7strhng7was8prodtced5by#evolutiln, 122]
 [this7rtrgng5was2produced3bz#evolutimn, 113]
 [thhs6strgng3was2producfd2by8evolution, 105]
 [this6strgng1was2produced2by5evolution, 98]
 [thjs4string0was1producee0by3euolution, 91]
 [thjs3string0was1produced0by1fvolution, 87]
 [this1ssring#was#prodtced#cy#evplutjon, 80]
 [thit0string#was#prodtced#bz#evolution, 75]
 [this0string#was#prodtced#by#euolutjon, 69]
 [this#string#was#produced#by#evoluuion, 63]
 [this#ssring#was#produced#by#evolution, 59]
 [this#string#was#qroduced#by#fvolution, 55]
 [this#string#was#prodvced#by#evolution, 51]
 [this#string#was#psoduced#by#evolution, 46]
 [this#string#was#producdd#by#evolutipn, 38]
 [this#string#was#procuced#by#evplutioo, 26]
 [uhis#string#was#produced#by#evolutioo, 22]
 [this#stoing#was#produced#by evolution, 20]
 [shis#stting#was#produced#by evolution, 17]
 [tiis stting#was#produced bx evnlutiom, 11]
 [this#stting was#procuced bx evnlution, 9]
 [this#stsing was producee by evomution, 4]
 [this string wat producee by evolution, 2]

Feel free to try the live version here or check the code out from github to play with.

CMU Binary Bomb: Phase 2

This is part of a series of walk-throughs for the CMU Binary Bomb Lab. You can find Phase 1 here

Phase 2 is where things start to get a little interesting. Taking a quick look at the body of the phase_2 function as we did in the first phase, there’s some useful information to be obtained here:


There’s a few useful bits of information in here, first is the glaringly obvious function name:

0x08048b5b    e878040000   call sym.read_six_numbers

So it’s not a stretch to think that this phase takes six numbers as its input. The function itself confirms this:

Since scanf() is being called with a template string, checking the string at the memory location might give some insight into how the user input string is being processed.

[0x080488e0]> psz @ 0x08049b1b
%d %d %d %d %d %d

So six decimals is what we’re looking for. Looking at the CMP instruction indicates that if there are less than 5 numbers supplied, the bomb will explode.
Going back to phase_2, the next important part is the next CMP instruction before the explode_bomb call.

Writing out the pseudocode for this is helpful, the ESI register indicates we’re probably dealing with an array of some sort. EBP-0x18 is where the series of numbers is stored and EBX is a counter starting at 1 and ending at 6, which appears to be a loop.

This means the pseudocode is likely something like this:

function phase_2(number_array) {
  if (number_array[0] != 1) {
    explode_bomb()
  }
  *esi = num_array;
  for (ebx = 1; ebx < 6; ebx++) {
    *eax = ebx+1
    eax = eax * (*esi) + (ebx*4) - 4
    if (*(esi+ebx*4) != eax) {
     explode_bomb()
    }
  }

}

Doing a little more cleanup:

function phase_2(number_array) {
  if (number_array[0] != 1) {
    explode_bomb()
  }

  for (x=1; x <= 5; x++) {
    current = number_array[x-1] * x+1
    if (number_array[x] != current) {
      explode_bomb()
    }

  }
}

So the first number needs to be 1, and the subsequence numbers should be equal to the previous number * (current position +1). This makes the sequence:
1 (1*2) (2*3) (6*4) (24*5) (120*6)

= 1 2 6 24 120 720

Boom! done.

CMU Binary Bomb Lab: Phase 1

The binary bomb lab from CMU is a legendary learning exercise as part of the Computer Systems course at Cargenie Melon University.

A “binary bomb” is a program provided to students as an object code file. When run, it prompts the user to type in 6 different strings. If any of these is incorrect, the bomb “explodes,” printing an error message and logging the event on a grading server. Students must “defuse” their own unique bomb by disassembling and reverse engineering the program to determine what the 6 strings should be. The lab teaches students to understand assembly language, and also forces them to learn how to use a debugger.

Dealing with the bomb involves defusing 6 phases, I’ll be covering one phase per article.

Tools (across all 6 phases)

$ ./cmubomb 
Welcome to my fiendish little bomb. You have 6 phases with
which to blow yourself up. Have a nice day!
> test phrase
BOOM!!!
The bomb has blown up.

Standard cracking techniques will work here, and with phase 1 there’s very little protection in place. A quick look at the disassembled source suggests that the code for each phase is contained within a function named phase_[number]. Radare2’s output is very useful once you get used to looking at it.

The phase_1 function is fairly simple. Using the PDF (print disassembled file) command along with sym.function_name allows viewing the function in isolation. Here’s an annotation of what it’s doing:

  push ebp
  mov ebp, esp
  sub esp, 0x8

This is the standard way of entering a function, putting EBP on the stack and swapping EBP with ESP, and then making space on the stack for local variables. If you’re unfamiliar with this sort of thing, be sure to read up on how stack frames work.

 mov eax, [ebp+0x8]

When you see numbers along with EBP, an easy way to translate what’s being handled is the following:

| [ebp + 16] (3rd function argument)
| [ebp + 12] (2nd argument)
| [ebp + 8]  (1st argument)
| [ebp + 4]  (return address)
| [ebp]      (old ebp value)
| [ebp - 4]  (1st local variable)

So the first argument to the phase_1 function (likely the string we entered) is being passed along in the next function call.

push str.Publicspeakingisveryeasy. ; 0x080497c0 
  push eax

Two arguments are being put on the stack in preparation for a function call. One is the function argument that was kept in eax, the other is a string of some sort.

call fcn.08049030 (sym.strings_not_equal)

So we’re calling strings_not_equal with two strings, in a piece of code that checks passwords.

 
  add esp, 0x10
  test eax, eax
  jz 0x8048b43

The strings_not_equal function puts its result in eax. Since the `test eax, eax` instruction is simply a fast way of checking if the value is zero, it’s quite probably that strings_not_equal returns zero if the strings are equal, much like strcmp.

call sym.explode_bomb

If the result of the strings_not_equal function is zero, the explode_bomb call would’ve been avoided.

  mov esp, ebp
  pop ebp
  ret

This does the opposite of what the intro to the function did, in prep for returning from the called function. Putting all this together it should be fairly clear what the phase_1 function does:

phase_1(password) {
  if (strings_not_equal(password, stored_password) > 0) {
    explode_bomb()
  } else {
    return
  }
}

Since radare2 gave us the location of the password string in memory, it’s very simple to grab it:

[0x080488e0]> psz n @ 0x080497c0 
Public speaking is very easy.

`psz` prints the null-terminated string at the location specified, and that’s our password for Phase 1!

Modern Binary Exploitation Labs: Reversing crackme challenges (levels 0-3)

Introduction

Modern Binary Exploitation is an undergraduate course released for free by RPISec. The course content covers basic and intermediate topics, and describes itself as follows:

Modern Binary Exploitation will focus on teaching practical offensive security skills in binary exploitation and reverse engineering. Through a combination of interactive lectures, hands on labs, and guest speakers from industry, the course will offer students a rare opportunity to explore some of the most technically involved and fascinating subjects in the rapidly evolving field of security.

The course will start off by covering basic x86 reverse engineering, vulnerability analysis, and classical forms of Linux-based userland binary exploitation. It will then transition into protections found on modern systems (Canaries, DEP, ASLR, RELRO, Fortify Source, etc) and the techniques used to defeat them. Time permitting, the course will also cover other subjects in exploitation including kernel-land and Windows based exploitation.

The first set of labs involve “Tools and Basic Reverse Engineering”, and include a set of crackme exercises. In the rest of this article I’ll be walking through the process of reversing the files in order to determine the solution.

Tools

Radare2

Radare is a unix-like reverse engineering framework and commandline tools http://www.radare.org/

GDB (with pwndbg)

GDB is a standard tool for debugging/exploit development, originally I was using it with the PEDA toolkit. Recently I discovered that pwndbg actually provides a lot more useful functionality and a consistent interface, so it’s a worthy addition to any RE toolkit.

Snowman

Snowman is a native code to C/C++ decompiler. Although it’s not perfect, it can provide some useful analysis of control flow at the least https://derevenets.com.

Crackme0x00a

First binary. When executed, we’re greeted with the following prompt:

Enter password: hi
Wrong!
Enter password: yes
Wrong!
Enter password: no
Wrong!
Enter password: password
Wrong!

So we need to figure out what the password is somehow. Let’s load this one into radare2 and see what kind of output comes out. Some useful functions for radare2 are ‘aa’ (analyze all) and ‘pdf’ (print decompiled function). Let’s take a look at the main function:

I’m going to go into the most detail on the first few crackmes, with progressively less of the basics as we go through. With some basic inspection it’s possible to see what this binary is doing.

Knowing how function calls work we can see that this code is using the scanf() function with the parameter %d, for instance.
Given that we’re looking for a password, and this code contains a call to strcmp (the standard C library to compare two strings) it seems likely that this is how the password is being checked against our input. There’s now a few ways of figuring out this password, each of which I’ll walk through as this illustrates the process nicely. Here’s the disassembled call to strcmp.

Method 1: Look through objdump

Looking at the disassembled instructions for strcmp it’s evidence the symbols are still present, and the password looks like it’s stored in sym.pass.1685. Running objdump -D on crackme0x00a and looking for pass.1685 turns up this:

0804a024 :
 804a024:       67 30 30                xor    %dh,(%bx,%si)
 804a027:       64 4a                   fs dec %edx
 804a029:       30 42 21                xor    %al,0x21(%edx)
 804a02c:       00 00                   add    %al,(%eax)

The second column is the data, and seems to be a null-terminated 8 character string (two hex digits = one byte) consisting of 67 30 30 64 4a 30 42 21. Mapped to decimal digits these become 103, 48, 48, 100, 74, 48, 66, 33, which are the ASCII codes for g 0 0 d J 0 B !. Which turns out to be the password!

Method 2: Check output of ‘strings’

`strings` is a small system utility that prints strings stored in binaries. Since we only want to see what’s in the data section the -d switch is helpful, resulting in this output:

$ strings -d ./crackme0x00a 
/lib/ld-linux.so.2
__gmon_start__
libc.so.6
_IO_stdin_used
__isoc99_scanf
puts
__stack_chk_fail
printf
strcmp
__libc_start_main
GLIBC_2.7
GLIBC_2.4
GLIBC_2.0
PTRh
D$,1
T$,e3
UWVS
[^_]
Enter password: 
Congrats!
Wrong!
;*2$"
g00dJ0B!

Method 3: Set a breakpoint at strcmp and examine in GDB

GDB makes it very easy to view code as it’s happening. Breakpoints can be set in the code, which are the “interesting” parts where we want to stop and examine what’s going on. A really useful breakpoint might be the strcmp function, because then we can see the two arguments being passed (our string and the target string). Starting up GDB for crackme0x00a and using the `break *0x0804852a` command, we can now execute `run` and should see something like this:

The lower third contains a visualization of the stack, and the top two items? Our comparison string and the password.

crackme0x00b

Let’s check the main() function for this one with the same commands as before: ‘aa’ followed by ‘pdf’:

Being able to map out the pseudocode for disassembled instructions in your head can be useful, since this crackme is similar to the last let’s start out with the pseudocode for the one we already solved (note, I’m not an expert at this by any means, I’m learning as I go):

main() {
  printf("Enter password: ");
  password = scanf("%s");
  while (strcmp(password, stored_password) != 0) {
    puts "Wrong!"
    password = scanf("%s");
  }
  puts "Congrats!"
}

But in our new crackme, the call is instead to wcscmp, and the scanf function uses %ls instead of %s. A little bit of research suggests that wcscmp is strcmp but for “wide” characters, and that %ls reads a “Pointer to wchar_t string”. So we’re dealing with something really similar to the first example but for wide characters. So we’re looking at:

main() {
  printf("Enter password: ");
  password = scanf("%ls");
  while (wcscmp(password, stored_password) != 0) {
    puts "Wrong!"
    password = scanf("%ls");
  }
  puts "Congrats!"
}

Running `strings` on the file doesn’t particularly help either. But method #1 from the previous example can work. Finding the relevant section in the objdump yields the following:

0804a040 :
 804a040:       77 00                   ja     804a042 
 804a042:       00 00                   add    %al,(%eax)
 804a044:       30 00                   xor    %al,(%eax)
 804a046:       00 00                   add    %al,(%eax)
 804a048:       77 00                   ja     804a04a 
 804a04a:       00 00                   add    %al,(%eax)
 804a04c:       67 00 00                add    %al,(%bx,%si)
 804a04f:       00 72 00                add    %dh,0x0(%edx)
 804a052:       00 00                   add    %al,(%eax)
 804a054:       65 00 00                add    %al,%gs:(%eax)
 804a057:       00 61 00                add    %ah,0x0(%ecx)
 804a05a:       00 00                   add    %al,(%eax)
 804a05c:       74 00                   je     804a05e 
 804a05e:       00 00                   add    %al,(%eax)
 804a060:       00 00                   add    %al,(%eax)

The wikipedia page for wchar_t states that “the width of wchar_t is compiler-specific and can be as small as 8 bits”. It looks here like they’re 32bits (reading down the second column). I did a quick test on my own system using the following program:

#include "wchar.h"

int main() {
  printf("%d", sizeof(wchar_t));
}

which outputs 4 (bytes), and so 32 bits. The bytes end up being 77 30 77 67 72 65 61 74, which translated into ASCII codes gives: “w0wgreat”. And this is the password!

crackme0x01

Our disassembled main() function:

The important part of this program is here:

Giving us a proposed code structure of something like the following:

main() {
  printf("IOLI Crackme Level 0x01\n");
  printf("Password: ");
  password = scanf("%d");
  if (password == 5247) {
    printf("Password OK!");
  } else {
    printf("Invalid Password");
  }
  return 0;
}

I mentioned the snowman decompiler in the intro, running crackme0x01 through the ‘nocode’ tool it supplies gives us the following (I’ve found nocode useful so far for mapping out the structure of a binary and figuring out the control flow):

int32_t main() {
    void* v1;
    void* v2;
    int32_t v3;

    fun_804831c("IOLI Crackme Level 0x01\n", v1);
    fun_804831c("Password: ", v2);
    fun_804830c("%d", reinterpret_cast(__zero_stack_offset()) - 4 - 4);
    if (v3 == 0x149a) {
        fun_804831c("Password OK :)\n", reinterpret_cast(__zero_stack_offset()) - 4 - 4);
    } else {
        fun_804831c("Invalid Password!\n", reinterpret_cast(__zero_stack_offset()) - 4 - 4);
    }
    return 0;
}

The scanf reads in a single decimal digit (%d), and then performs a cmp against some value (0x149a), jumping to “Password OK” if it’s equal. Therefore if we try the number 5274 it should work!

crackme0x02

Starting out with the radare2 output for the main function again:

Much of the structure is similar to the previous exercises, the part that’s different is the important segment with the imul operation here:

What might the pseudocode for this look like? Remember that the layout of the stack frame in IA32 architecture keeps local variables at locations below the stack pointer, so the definitions up the top are referencing two local variables at ebp-0x8 and ebp-0xc and assigning the values 0x5a (90) and 0x1ec (492) respectively. I’ve annotated the disassembly with what operations the parts of the code correspond to:

0x0804842b    c745f85a000. mov dword [ebp-0x8], 0x5a ; a = 90
0x08048432    c745f4ec010. mov dword [ebp-0xc], 0x1ec ; b = 492
0x08048439    8b55f4       mov edx, [ebp-0xc]
0x0804843c    8d45f8       lea eax, [ebp-0x8]
0x0804843f    0110         add [eax], edx ; a = a + b
0x08048441    8b45f8       mov eax, [ebp-0x8]
0x08048444    0faf45f8     imul eax, [ebp-0x8] ; a = a * a
0x08048448    8945f4       mov [ebp-0xc], eax
0x0804844b    8b45fc       mov eax, [ebp-0x4]
0x0804844e    3b45f4       cmp eax, [ebp-0xc]  ; if (password == a)
0x08048451    750e         jnz 0x8048461

The password is compared with the value of ((90 + 492) * (90 + 492)) which results in 338724, which is the password!.
Interestingly, running this code through snowman resulted in some of the excess operations being folded and optimised out. The function ends up being:

int32_t main() {
    void* v1;
    void* v2;
    int32_t v3;

    fun_804831c("IOLI Crackme Level 0x02\n", v1);
    fun_804831c("Password: ", v2);
    fun_804830c("%d", reinterpret_cast(__zero_stack_offset()) - 4 - 4);
    if (v3 != 0x52b24) {
        fun_804831c("Invalid Password!\n", reinterpret_cast(__zero_stack_offset()) - 4 - 4);
    } else {
        fun_804831c("Password OK :)\n", reinterpret_cast(__zero_stack_offset()) - 4 - 4);
    }
    return 0;
}

Spoiler alert! The hex value 0x52b24 corresponds to the decimal value of 338724.

crackme0x03

This one is extremely similar to the last. In fact, the solution is the same. When disassembling the binary with radare2 you might notice a slight difference. Now, instead of directly comparing the password against the stored password in the main() function, it calls a test() function that looks like this:

The structure of this code shouldn’t really be too difficult to understand, here’s what it might look like in pseudocode:

test(a, b) {
  if (a == b) {
    shift("SdvvzrugRN")
  } else {
    shift("LqydolgSdvvzrug")
  }
}

The fact that the function is named “shift” and that it contains what seems like random letters should be a clue, as is the structure of the test() function. I suspect this is using a simple caesar cipher and rotating the letters. I was looking for a quick way to test this theory, so I hacked up some ruby to rotate the characters in the string and see if I got anything interesting.

require 'pp'

str = "LqydolgSdvvzrug"

alpha = ('a'..'z').to_a + ('A'..'Z').to_a

1.upto(26) do |i|
  pp str.split("").map { |s| (s.ord - i).chr }.join
end

The resulting output is:

"KpxcnkfRcuuyqtf"
"JowbmjeQbttxpse"
"InvalidPassword"
"Hmu`khcO`rrvnqc"
"Glt_jgbN_qqumpb"
[..truncated..]

So we’re looking at strings rotated 3 places to the left. None of this was really necessary, but there you go. I hope you enjoyed this walkthrough, I’ve split the overall guide into several sections else they can get pretty long, so look forward to more soon!.