Skip to content

Ray is coding Posts

Writing a NES emulator in Javascript: Part 5 – Debugging the CPU

I’ll be honest with you, I’m really not used to working at this low a level. Bitwise math hasn’t been my strength, so it has been important to ensure I have some sort of reference to work from. The popular choice when working with NES emulators is the so-called “golden log” produced by Nintendulator, a particularly accurate NES emulator. By reading through the log and writing a similar output system for my own CPU, it’s easily possible to hand-check the accuracy of the opcodes. After doing this for about a week I found it pretty tedious however, and decided I could speed things up a bit

Automated Testing

A simple solution seemed to be to parse the nestest log and turn it into a javascript-friendly object, and then run the CPU instruction-by-instruction and check the reference. I opted to just check A, X, Y, P and SP, as I still really am not sure how the cycle count works (more on that in another article). First step was to write a simple class that read/parses the nestest log:

class NEStest {
  constructor() {
    this.code = [];
    var fs = require("fs");
    var nt = fs
      .readFileSync("nestest.log", "utf-8")
      .split("\n")
      .filter(x => x != "");
    nt.forEach(line => {
      let segments = line.split(/[ ]{2,}/);
      let line_hash = { flags: { PC: segments[0] } };
      let flags = segments[3].split(/ /);

      flags.forEach(flag_str => {
        var parts = flag_str.split(":");
        line_hash.flags[parts[0]] = parts[1];
      });

      this.code[parseInt(segments[0], 16)] = line_hash;
    });
  }
}

I then added a test() function to the emulator class to be an alternative to boot(). In order to make this code work, it was necessary to modify the CPU.execute() function to take an optional argument that specified how many instructions to execute. Allowing a step-by-step CPU execution enabled checking against the log entries. The code tests against the nestest reference line-by-line until one of the registers doesn’t match the expected value and then bails:

  test() {
    let test_suite = new nestest();

    // put the CPU into the default state
    this.cpu.reset();

    let running = true;

    while (running) {
      /* executing with a numerical argument executes a limited number of instructions */
      this.cpu.execute(1);

      /* look up the comparison line in the nestest object */
      let test_line = test_suite["code"][this.cpu.registers.PC];

      if (!test_line) {
        /* there's no nestest.log entry with this address
        it's safe to assume something went really wrong */
        console.log("CPU address not expected!");
        process.exit();
      }

      Object.entries(this.cpu.registers).forEach(register => {
        if (
          parseInt(this.cpu.registers[register[0]]) !=
          parseInt(test_line.flags[register[0]], 16)
        ) {
          let output_message = `Registers ${register[0]} mismatch\n`;
          output_message += `Got ${this.cpu.registers[register[0]].toString(16)}, expected ${test_line.flags[register[0]]}`;

          console.log(output_message);
          console.log(this.cpu.registers);
          console.log(this.cpu.flags);
          running = false;
        }
      });
    }
  }

This really has made it easier to debug the CPU functionality, as it results in output such as this where there’s an implementation issue:

c812 Beginning execution at 51218			[A: 80 X: 0 Y: 0 P: ad SP: fb CYC: 374]
c812 LDA #$0			[A: 80 X: 0 Y: 0 P: ad SP: fb CYC: 374]
0
c814 Beginning execution at 51220			[A: 0 X: 0 Y: 0 P: 2f SP: fb CYC: 378]
c814 SEC			[A: 0 X: 0 Y: 0 P: 2f SP: fb CYC: 378]
c815 Beginning execution at 51221			[A: 0 X: 0 Y: 0 P: 2f SP: fb CYC: 381]
c815 PHP			[A: 0 X: 0 Y: 0 P: 2f SP: fb CYC: 381]
c816 Beginning execution at 51222			[A: 0 X: 0 Y: 0 P: 2f SP: fa CYC: 385]
c816 PLA			[A: 0 X: 0 Y: 0 P: 2f SP: fa CYC: 385]
c817 Beginning execution at 51223			[A: 3f X: 0 Y: 0 P: 2d SP: fb CYC: 390]
c817 AND #239			[A: 2f X: 0 Y: 0 P: 2d SP: fb CYC: 390]
Registers P mismatch
Got ad, expected 2D
{ PC: 51225, SP: 251, P: 173, A: 47, X: 0, Y: 0 }
{ carry: true,
  zero: false,
  interrupt_disable: true,
  decimal_mode: true,
  break_command: false,
  overflow: false,
  negative: true,
  decimal: false }

Writing a NES emulator in Javascript: Part 4 – Opcodes

In previous articles I talked about the initial process of getting the PRG ROM loaded and executing. Now comes the somewhat lengthy process of implementing all the opcodes that were available on the NES CPU. The fact that the NES brain is so well documented is a huge plus here, including the 6502 programming manual and this opcode list.

Despite what I mentioned in a prior article, I’m no longer using a giant switch statement for all the opcodes. They’re now stored in an object in a separate file, with the name of the opcode (as a constant) being the key. In another file (optable.js) I store information about the instructions, such as byte length and cycle count.

Implementing the instructions

The process I’ve been following for implementing the instructions worked as follows:

  1. Have a good debugger output for instructions as they execute
  2. Execute the instructions and crash when we encounter one that hasn’t been implemented yet, outputting the name of the unimplemented instruction. This can then be looked up in the reference doc and implemented
    while (this.running == true) {
      let opcode = this.fetch(this.registers.PC);

      if (!opcode) {
        /* we've reached the end of execution, here be dragons */
        break;
      }

      /* Debugging code, causes the unimplemented opcode to be written out and the program
         to bail. Useful during development. */

      if (
        typeof optable[opcode] === "undefined" ||
        typeof optable[opcode].op === "undefined"
      ) {
        console.log(
          "Unimplemented opcode: " +
            opcode.toString(16) +
            " at " +
            this.registers.PC.toString(16)
        );
       this.running = false;
       continue;
      }
      /* the bind() call makes the 'this' object the CPU when the opcode closure is executed */
      optable[opcode].op.bind(this).call();

I’ve omitted some code but you get the idea, the opcodes are being pulled from the memory using the Program Counter (CPU.registers.PC), and then get looked up in the optable module. The optable.js file contains entries such as this:

optable[0xEA] = {
  cycles: 2,
  bytes: 1,
  name: "NOP",
  op: opcodes["NOP"]
};

All the opcodes are contained within a file called ops.js, where each opcode function is stored as a value in an object with the key being the opcode byte. Instead of writing extra code to decode the addressing format, it seemed simpler to just implement a function for each addressing mode. The function calls are so small (often one line) that the duplication makes things simpler than writing an addressing mode decoder, but this may have turned out to be a bad choice further down the line. So taking an opcode like LDA in the Indirect addressing mode using the X register we’d have something like:

  LDA_IND_X: function() {
    this.log(
      "LDA ($" + this.next_byte().toString(16) + "), X",
      this.registers.PC
    );
    this.registers.A = this.fetch(this.next_byte() + this.registers.X);
  },

This approach works fine, but I needed some way to be able to test for emulation accuracy. In the next article I’ll discuss the testing system I set up to ensure fidelity in executing the NES code.

Better know an algorithm: Burrows-Wheeler transformation

Introduction

I’ve always thought this algorithm was particularly awesome, and you’ll see why. Here’s the description by wikipedia:

The Burrows–Wheeler transform (BWT, also called block-sorting compression) rearranges a character string into runs of similar characters. This is useful for compression, since it tends to be easy to compress a string that has runs of repeated characters by techniques such as move-to-front transform and run-length encoding.

So a string such as “BANANA” is rendered as “BNN|AAA” (the | denotes EOF, we’ll cover this later). The algorithm can take a piece of text with repeating _sequences_ and transform it into a string of repeating letters. Remarkably, this process is reversible. The strings that benefit the most from this kind of transformation are ones with repeating sets of characters. Imagine a DNA sequence such as follows: “CGATCGATCGATCGATCGAT”. This would not be typically compressible by a scheme such as Run-Length Encryption. But after a BWT application this becomes “GGGGG|TTTTCCCCCAAAAAT”, which can be efficiently compressed. And on decompression the string can be transformed back into its original string form.

The Algorithm

def bw(string)
  string += "|"
  columns = []
  0.upto(string.length - 1) do |n|
    columns.push(string.chars.rotate(n*-1))
  end
  columns.sort!.map { |c| c[-1] }
end

    Steps

  • Add an EOF (“|”) character to mark the end of the string, this is needed for reversing the process
  • Create an array of cyclic rotated versions of the string
  • Sort the array lexographically, so similar sequences are next to each other
  • Take the last column

In the canonical example of “BANANA” we go through the following steps:

    BANANA

    BANANA|

    BANANA|
    |BANANA
    A|BANAN
    NA|BANA
    ANA|BAN
    NANA|BA
    ANANA|B

    ANANA|B
    ANA|BAN
    A|BANAN
    BANANA|
    NANA|BA
    NA|BANA
    |BANANA
  

Why does this work?

Initially, I implemented the algorithm and then just stared at it, wondering how the hell it could possibly work. The key insight is that it works primarily with repeating sequences of characters. The DNA sequence example shows this more clearly:

 ATCGATCGAT|CG
 ATCGAT|CGATCG
 AT|CGATCGATCG
 CGATCGATCGAT|
 CGATCGAT|CGAT
 CGAT|CGATCGAT
 GATCGATCGAT|C
 GATCGAT|CGATC
 GAT|CGATCGATC
 TCGATCGAT|CGA
 TCGAT|CGATCGA
 T|CGATCGATCGA
 |CGATCGATCGAT

By grouping together the similar substrings, letters that precede the said substring end up aligned next to each other. Pretty cool. While this works, there’s a caveat. This isn’t particularly efficient as it requires holding N rotations of the string in memory. In the next article I’ll be covering how to improve the space requirements, and more importantly how to reverse the process.

Writing a NES emulator in Javascript: Part 3 (booting the first code)

  Note: This is part of a series of posts on writing a NES emulator in Javascript. You can follow the project on github here.

We need a few things to get NES code booting. One of these will be a NES ROM. Doing this I’ll be using the nestest ROM due to copyright restrictions surrounding using an official Nintendo ROM.

iNES format

The format used by most emulators for reading games is the iNES format, which contains the following sections:

Header (16 bytes)
Trainer, if present (0 or 512 bytes)
PRG ROM data (16384 * x bytes)
CHR ROM data, if present (8192 * y bytes)
PlayChoice INST-ROM, if present (0 or 8192 bytes)
PlayChoice PROM, if present (16 bytes Data, 16 bytes CounterOut) (this is often missing, see PC10 ROM-Images for details)
(source: nesdev wiki)

With the header being formatted as follows:

0-3: Constant $4E $45 $53 $1A (“NES” followed by MS-DOS end-of-file)
4: Size of PRG ROM in 16 KB units
5: Size of CHR ROM in 8 KB units (Value 0 means the board uses CHR RAM)
6: Flags 6
7: Flags 7
8: Size of PRG RAM in 8 KB units (Value 0 infers 8 KB for compatibility; see PRG RAM circuit)
9: Flags 9
10: Flags 10 (unofficial)
11-15: Zero filled
(source: nesdev wiki)

Since I have really no idea what to do with the CHR ROM data yet, it’s the PRG_ROM data we’re interested in, as this contains the executable game code. The nestest ROM header looks like this (hexedit output):

00000000   4E 45 53 1A  01 01 00 00  00 00 00 00  00 00 00 00

The bytes 0x4E, 0x46 and 0x53 are the “NES” string. We can see that bytes 4 and 5 are 1, which means the PRG_ROM will be 16KB and the CHR ROM will be 8KB. Therefore the PRG_ROM starts at 0x10 (we don’t have a trainer file) and continues for 16384 bytes. The quick and dirty ROM parser I wrote does the trick of grabbing the data and putting it in a UInt8Array:

    this.rom = new Uint8Array(this.rom_buffer);
    this.prg_r1_size = this.rom[PRG_ROM_SIZE_INDEX];
    this.chr_size = this.rom[CHR_ROM_SIZE_INDEX];
    const ROM_SIZE = ROM_MULTIPLE_SIZE*this.prg_r1_size;
    this.prg_rom = new Uint8Array(this.rom_buffer.slice(PRG_ROM_START_INDEX, ROM_SIZE));

Will it boot?

The NES does some set-up on each reset that gets flags and registers in the right place to begin executing. One thing to remember is that the program ROM is mapped to a region in memory. From the NESdev wiki it seems that the PRG ROM is loaded at 0xC000. In the memory mapping function this is simulated as follows:

fetch(addr) {
if(addr >= 0 && addr <= 0x7FF) { 
  [..] 
} else if (addr > 0x4020 && addr < 0xFFFF) { 
  if (addr >= 0xC000 && addr < 0xFFFF) {
    return this.rom[(addr-0xC000)];
  }
}

The fetch() function pulls a byte out of memory, and if the requested byte index is from 0xC000 to 0xFFFF we want to return the corresponding bytes from the program ROM.

One small problem

Since the code is only for the CPU, if the ROM attempts to use the PPU the emulator will fail. There’s a small workaround. The default reset vector for the emulator is $C004 but while using the nestest ROM there’s an automated mode that starts at $C000 (thanks to this forum post). For this reason, it was necessary to hardcode the reset vector as $C000 to get things rolling. Since I wrote the last part of this article, the structure of the code change, and the giant switch statement was removed. Here’s what the code to get instructions executing looks like now:

  while (this.running == true) {

    let opcode = this.fetch(this.registers.PC);
    optable[opcode].op.bind(this).call();
    [...]

And things are actually booting! This ended up being a hodgepodge of an article due to being written over a long period in which huge structural changes were happening to the code. In the next article I’ll discuss some of the changes I’ve made to improve execution, and how I went about implementing the opcode table.

Check out the current code here.

Writing a NES emulator in Javascript: Part 2 (starting on the CPU)

Let’s start with the most obvious choice: The CPU. The NES’s 2A03 CPU is a modified 6502 processor. There’s an interesting story behind the choice of processor:

“The Nintendo core processor was a 6502 designed with the patented technology scraped off, We actually skimmed off the top of the chip inside of it to see what it was, and it was exactly a 6502. We looked at where we had the patents and they had gone in and deleted the circuitry where our patents were.”

Although there were changes, the NES microprocessor ran 99% of the 6502 instruction set. “Some things didn’t work quite right or took extra cycles,”

One important difference is the removal of the decimal mode, which was patented technology. The flag still exists, but is permanently disabled.

The focus of this implementation was to get code working, and so no effort is being made to optimise anything. For instance, some of the bitwise function are implemented using string operations and will be switched over later.

In order to test against a reference, I’m using nestest.nes which the docs describe as follows:

This here is a pretty much all inclusive test suite for a NES CPU. It was designed to test almost every combination of flags, instructions, and registers. Some of these tests are very difficult, and so far, Nesten and Nesticle failed it. Nintendulator passes, as does a real NES (naturally). I haven’t tested it with any more emulators yet.

I attempted to check the states of all flags after most instructions. For example, CPY and CMP shouldn’t affect the overflow flag, while SBC and ADC should. Likewise, all forms of wrapping ARE tested for- zeropage wrapping being the tests most emulators fail.

Starting the code

The log output for nestest.nes from Nintendulator is used as a reference. The CPU is represented by a class that contains the processor logic and attributes for the various parts of the CPU:

 

class CPU {

  constructor() {
    this.memory = new Memory;
    this.registers = {PC: 0, SP: 0, P:0, A:0, X: 0, Y: 0};
    this.flags = {carry: false, zero: false,
      interrupt_disable: true, decimal_mode: false,
      break_command: true, overflow: false, negative: false};
    this.running = false;
    this.cycles = 0;
    this.logger = new Logger;
    this.stack = new Array;
    this.log("CPU Initialized");
  }

Memory access is split out into its own class which contains functionality for getting and setting memory. The NES includes memory mapped I/O Implementing memory mapping in the CPU involves mapping particular addresses to different hardware elements. The fetch() function of the memory class accomplishes the memory mapping for retrieval detailed here:

 

  fetch(addr) {
    if(addr >= 0 && addr <= 0x7FF) {
      // 0000-00FF Zero-paged region
      // 0100-01FF - Stack Memory
      return this.ram[addr];
    } else if (addr >= 0x800 && addr < 0x0FFF) {
      // mirrors RAM
      return this.ram[addr-0x800];
    } else if (addr >= 0x1000 && addr < 0x17FF) {
      // mirrors RAM
      return this.ram[addr-0x1000];
    } else if (addr >= 0x1800 && addr < 0x1FFF) {
      // mirrors RAM
      return this.ram[addr-0x1800];
    } else if (addr >= 0x2000 && addr < 0x2007) {
      // NES PPU registers
    } else if (addr >= 0x2008 && addr < 0x3FFF) {
      // Mirrors of $2000-2007
    } else if (addr >= 0x4000 && addr < 0x4017) {
      // NES APU and I/O registers
    } else if (addr >= 0x4018 && addr < 0x401F) {
      // APU and I/O functionality
    } else if (addr > 0x4020 && addr < 0xFFFF) {
      if (addr >= 0xC000 && addr < 0xFFFF) {
        return this.rom[(addr-0xC000)];
      }
    }

  }

We'll get to how the ROM part works in the next article.

I opted to implement the opcodes as a giant switch() statement, which while not optimal will get the job done. An alternative approach I could've used was the one Imran Nezar used of creating an object containing all the operations as values, with the opcodes as the keys.:

  execute() {

    this.log("Beginning execution at " + this.registers.PC);
    this.running = true;

    while(this.running == true) {

      let opcode = this.memory.fetch(this.registers.PC);
      switch(opcode) {

      case ops.NOOP:
        let nb1 = this.memory.fetch(this.registers.PC+1);
        let nb2 = this.memory.fetch(this.registers.PC+2);
        let j = utility.Utility.merge_bytes(nb1, nb2);
        this.log("JSR " + j, this.registers.PC);
        let bytes = utility.Utility.split_byte(this.registers.PC+3);
        this.stack.push(bytes[0]);
        this.stack.push(bytes[1]);
        this.registers.SP -= 2;
        this.registers.PC = j;
      break;
      case ops.TSX:
          [...]
      break;
      case ops.TXS:
          [...]
      break;
      case ops.TYA:
          [...]
      break;

In the next article I'll talk a little about how to load a ROM file and how to get the code booting.

Writing a NES emulator in JavaScript: Part 1

Warning: these may be some long articles

Introduction & Inspiration

I’ve always wanted to write an emulator, I’ve found the concept fascinating, and I thought it was finally time to complete a emulation project. I went with the NES for the console, and JavaScript as the language. The reason being the NES is well documented, there are functional emulators out there with the ability to provide debugging output, and I owned a NES myself.

 

About Emulators

If you’re unfamiliar with how emulators work, aside from that they let you play videogames on your computer, the basic explanation is that the software you’re using is a code version of the original concept. Instead of the NES hardware reading and acting on the instructions contained in the ROM code on the cartridge, a program handles the CPU instructions, creates its own RAM, and maps graphical output instructions to visual output on your screen. This is incredibly complex, and hard to get the hang of, but it’s also a lot of fun, and provides some interesting problems to solve.

The height of technological progress

For instance, the NES hardware is a little idiosyncratic, and often restrictive in what it allowed programmers to be able to do. As a result, NES game programmers learned to exploit aspects of the hardware in very interesting ways, relying on the exact hardware implementation’s quirks to produce the result. This could mean that a “perfect” emulator that runs each component exactly may not run games properly.

The Hardware

Let’s take a look at the NES hardware. The NESdev wiki has been an amazing resource so far, and it’s a great place to start from. In these really early stages, gathering an idea of the hardware to be implemented and a rough sketch of how things fit together would be useful.

The goal of the first phase was to get a rough outline of the code and get a few instructions working. In order to do this a few things need completing:

  1. Basic CPU emulation What flags and registers does the CPU use? How does it calculate clock cycles for synchronization?
  2. Memory Mapping How do different parts of hardware map to addresses in memory? (this is the primary way that the CPU communicates with other hardware parts)
  3. Basic ROM Loading. Where is the executable data stored in the data from the cartridge?
  4. Ability to debug instructions Let’s be real, I have no clue what I’m doing with a lot of this, I’m going to need something to check against when I get stuck.

The CPU

The NES CPU core is based on the 6502 processor and runs at approximately 1.79 MHz. Reading this PDF helped immensely. The 6502 has the following registers:

  • A Accumulator Register
  • X, Y Index Registers
  • SP Stack Pointer
  • P Program Counter
  • S Status Flag

At its most simplest, a CPU is only really doing a few simple things. We fetch a byte from memory, decode the opcode, and take some action based on it. And then we do it again. And again. Until we have a reason not to. The CPU is where I’m going to start, and in the next post I’ll talk a bit about how that’s implemented.

Conway’s Game of Life (JavaScript)

The Game of Life, also known simply as Life, is a cellular automaton devised by the British mathematician John Horton Conway in 1970.

.

Video recording of simulation, live demo can be found here.

Conway’s Game of Life is awesome for being the simplest example of artifical life. It implements a rule-based system for determining whether elements in a grid should be alive or dead, with a very simple rule system. But like with DNA or Ant Colonies (read about it!) a simple set of rules can result in an amazing array of diverse behaviour.

Conway’s Game of Life isn’t hard to implement, and seems like a great target for optimisation. The logic is pretty simple, we need a grid of cells. A lot of people
use two-dimensional arrays for this, but I opted for a 1D array where the rows/columns are calculated through a translation function. Each cell contains a value of 0 or 1 for dead or alive.

The Code

class Grid {
  constructor() {
    this.grid = new Uint8Array(GRID_SIZE);
    this.next = new Uint8Array(GRID_SIZE);
    this.init();
  }

  dump() {
    return this.next;
  }
  swap() {
    this.grid = this.next;
    this.next = new Uint8Array(GRID_SIZE);
  }
  at(x, y) {
    return y * GRID_WIDTH + x;
  }
  init() {
    this.grid = fast_random(GRID_SIZE);
  }

The first part of the Grid class, the main meat of the code, is pretty simple. The Grid stores a current grid and a next grid, which is updated and then swapped with the current one during animation cycles. When the Grid is initialized it grabs GRID_SIZE bits using the fast_random function (more on this later).

Survival Function

For each cell within the “grid”, we need to calculate the number of neighbours to figure out its fate in the next generation. The rules for the game are as follows:

Any live cell with fewer than two live neighbours dies, as if caused by underpopulation.
Any live cell with two or three live neighbours lives on to the next generation.
Any live cell with more than three live neighbours dies, as if by overpopulation.
Any dead cell with exactly three live neighbours becomes a live cell, as if by reproduction.

 

  update() {
    for (let x = 0; x < GRID_WIDTH; x++) {
      for (let y = 0; y < GRID_WIDTH; y++) {
        let score = 0;
        const CURRENT_LOCATION = this.at(x, y);
        for (let i = -1; i < 2; i++) {
          for (let j = -1; j < 2; j++) {
            if (i == 0 && j == 0) {
              // skip self
            } else {
              score += this.grid[this.at(x + i, y + j) % GRID_SIZE];
            }
          }
        }

Here we’re just looping over every cell of the grid, and for each of the cells adding up the 1s in the surrounding 8 cells. From this we can figure out whether it’s going to be alive or dead based on its current state and its neighbour count:

        switch (this.grid[CURRENT_LOCATION]) {
          case 0:
            if (score == 3) {
              this.next[CURRENT_LOCATION] = 1;
            } else {
              this.next[CURRENT_LOCATION] = 0;
            }
            break;
          case 1:
            if (score > 3 || score < 2) {
              this.next[CURRENT_LOCATION] = 0;
            } else {
              this.next[CURRENT_LOCATION] = 1;
            }
            break;
        }
      }

Pretty simple!

Rendering the Output

I wanted to write several potential renderers to test what gave the best performance. Each simply takes a grid as an array and draws it.

const renderer = RectangleRenderer;
render = new renderer();
let grid = new Grid();

function iterate() {
  grid.update();
  render.render(grid.dump());
  grid.swap();
  requestAnimationFrame(iterate);
}

var ctx;
$(document).ready(function() {
ctx = $("#game-view")[0].getContext("2d");
iterate();
});

Initial Optimisations

Ok so, one problem. With a 256×256 grid the initial start-up takes up to 9 seconds. Some quick JS Profiling in chrome seemed to indicate that the culprit was the init() function.

The init() function was seeding the grid by iterating over the entire grid’s X/Y coordinates and filling each cell with a random 0 or 1. But in order to do this, the Math.random() function was being called for each cell (meaning it needs to be called 65535 times) as the following:

  cell = Math.floor(Math.random()*2)

As it turns out this is extremely wasteful. In theory, each call to Math.random() could be used to supply up to 32 unique bits per call, which would reduce the number of calls significantly. After implementation, the init() function began to run almost instantaneously:

function random_byte() {
  let el = Math.round(Math.random() * (Number.MAX_SAFE_INTEGER/2) + Number.MAX_SAFE_INTEGER/2);
  let bits = [];
  static_counter += 1;

  while (el > 1) {
    let bit = Math.floor(el % 2);
    bits.push(bit);
    el = el / 2;
  }

  return bits;
}

function fast_random(size) {
  let array = [];

  while (array.length < size) {
    let c = random_byte();
    array = array.concat(c);
  }

  return array.slice(0,size);
}

With this function added, the initial delay disappeared entirely for a 256x256 grid. In the next article I'm going to look at some of the algorithms used in speeding up the Game of Life as an implementation level.

Ruby’s Array#map_with_index and each_with_index in Javascript

#each_with_index and #map_with_index are commonly used Ruby features that can be easily achieved in JavaScript

Each with index

In Ruby we can do this:

x = ["a","b","c","d","e"]

x.each_with_index { |el,n| puts "#{n} is #{el}" }

Output:

0 is a
1 is b
2 is c
3 is d
4 is e

In Javascript it’s possible to do something very similar. The documentation for JavaScript’s Array.forEach function states that the three arguments passed to the callback function are:

callback is invoked with three arguments:

the element value
the element index
the array being traversed

Therefore we can do this:

let array = ["a","b","c","d","e","f"];

array.forEach((x,n) => (console.log(x + " is " + n)));
a is 0
b is 1
c is 2
d is 3
e is 4
f is 5

Map with index

In Ruby it’s possible to do this:

x = ["a","b","c","d","e","f"]

z = x.map.with_index { |el,n| (el.ord + n).chr }

puts z

Output:

a
c
e
g
i
k

Similar to with forEach, Map has the following arguments passed to its function argument:

currentValue

The current element being processed in the array.

indexOptional

The index of the current element being processed in the array.

arrayOptional

The array map was called upon.

Which makes it possible to do this:

let array = ["a","b","c","d","e","f"];

let new_array = array.map((x,n) => String.fromCharCode(x.charCodeAt(0)+n));

console.log(new_array);

Output

[ 'a', 'c', 'e', 'g', 'i', 'k' ]

Pwnable.kr CTF writeup: level 6 – random

This one turned out to be surprisingly easy. What’s required is an understanding of random number generators, and how they’re not really random. Implementations of random number generators are known as PRNGs, acknowledging that they are pseudo random number generators. They use various tricks in order to produce streams of not obviously connected numbers, but without an understanding of how they’re meant to work it can be easy to produce a problem similar to the one described in this CTF.

Code

#include

int main(){
unsigned int random;
random = rand(); // random value!

unsigned int key=0;
scanf("%d", &key);

if( (key ^ random) == 0xdeadbeef ){
printf("Good!\n");
system("/bin/cat flag");
return 0;
}

printf("Wrong, maybe you should try 2^32 cases.\n");
return 0;
}

Pretty simple to understand, we have two variables, random and key. A decimal value is read into key through scanf, and rand() assigns a “random” number to the variable ‘random’. We then XOR the values and if the result is 0xdeadbeef we give up the flag. In theory, this should be hard to beat. Wouldn’t we have to keep trying values until we happen to get the right one? And with such a large problem space, that could take a very long time.

Key Insight

The most important thing to recognise is that this is incorrect use of the rand() function. Using PRNG correctly involves “seeding” the generator with a value that changes frequently, a common tactic is to use the current timestamp supplied to srand (there are some issues with this too, see if you can figure them out). This leads to predictable behaviour, which becomes a problem when knowing the random value allows us to exploit the behaviour. A list of historical examples of where this has been used in real-world attacks can be found on wikipedia’s page on random number generator attacks

Since the rand() call isn’t seeded, it’s fairly trivial to check the value of a few runs of the program in GDB. Setting a breakpoint for the next line after the call to rand() makes it easy to observe what’s returned from the function into the RAX register (the one used to store return values traditionally). A few runs through and it’s easy to see that this value doesn’t change. Due to failure to seed the PRNG it’s returning the exact same value each time, which happens to be 0x6b8b4567. Now we have the following simple equation to solve: something XOR 0x6b8b4567 = 0xdeadbeef. The value we’re looking for will be 0xdeadbeef XOR 0x6b8b4567, which ends up being = 0xb526fb88 which is 3039230856. Let’s try it:

random@ubuntu:~$ ./random 
3039230856
Good!
Mommy, I thought libc random is unpredictable...

Pwnable.kr CTF walkthrough: level 1 fd

Pwnable.kr contains a set of CTF challenges ranging from beginner to advanced. I’ve been working through them to increase my understanding of binary exploitation and Reverse Engineering, and I’ll be posting walkthroughs for ones I’ve solved

Logging in to the machine specified, we see a banner and have a bash shell, and listing the contents of the homedir we can see we have three files, a binary, a C source file (of the binary) and the flag file.

$ ssh fd@pwnable.kr -p2222
fd@pwnable.kr's password: 
 ____  __    __  ____    ____  ____   _        ___      __  _  ____  
|    \|  |__|  ||    \  /    ||    \ | |      /  _]    |  |/ ]|    \ 
|  o  )  |  |  ||  _  ||  o  ||  o  )| |     /  [_     |  ' / |  D  )
|   _/|  |  |  ||  |  ||     ||     || |___ |    _]    |    \ |    / 
|  |  |  `  '  ||  |  ||  _  ||  O  ||     ||   [_  __ |     \|    \ 
|  |   \      / |  |  ||  |  ||     ||     ||     ||  ||  .  ||  .  \
|__|    \_/\_/  |__|__||__|__||_____||_____||_____||__||__|\_||__|\_|
                                                                     
- Site admin : daehee87.kr@gmail.com
Last login: Sat Feb 17 06:54:22 2018 from 121.125.207.90
fd@ubuntu:~$ ls
fd  fd.c  flag
fd@ubuntu:~$ 

Checking the ownership of the files yields the following:

fd@ubuntu:~$ ls -l
total 16
-r-sr-x--- 1 fd_pwn fd   7322 Jun 11  2014 fd
-rw-r--r-- 1 root   root  418 Jun 11  2014 fd.c
-r--r----- 1 fd_pwn root   50 Jun 11  2014 flag

So we know that the ‘fd’ executable has the SUID bit set, and will run as fd_pwn. We can’t access the flag directly, as it’s owned by fd_pwn, but presumably we can subvert the execution of the fd binary and access the flag file as the user that owns it. A quick look at the source of fd yields a contrived exploitable example, but one that still requires a couple of insights to solve:

#include 
#include 
#include 
char buf[32];
int main(int argc, char* argv[], char* envp[]){
        if(argc<2){
                printf("pass argv[1] a number\n");
                return 0;
        }
        int fd = atoi( argv[1] ) - 0x1234;
        int len = 0;
        len = read(fd, buf, 32);
        if(!strcmp("LETMEWIN\n", buf)){
                printf("good job :)\n");
                system("/bin/cat flag");
                exit(0);
        }
        printf("learn about Linux file IO\n");
        return 0;

}

The important call here is to read(), where data is read into a buffer from a file descriptor obtained by converting the user supplied string to integer and subtracting 0x1234 (4660 in decimal). If you're unfamiliar with linux's file descriptors read this first. The key insight is that processes in linux have numerical file descriptors assigned to them, by default each has 3 (0, 1 and 2) corresponding to STDIN, STDOUT and STDERR. In theory, if we can make the read() call obtain its data from the file descriptor 0, it can be tricked into obtaining data from standard input and placing it in the character buffer 'buf'. We can then simply supply the string LETMEWIN on the command line and obtain the flag. Supplying 4660 as the user input results in the fd being 0, and prompts for a string. And we're done!

fd@ubuntu:~$ ./fd 4660
LETMEWIN
good job :)
mommy! I think I know what a file descriptor is!!