You've probably noticed that WebGL is really low level and it doesn't even support rendering text. It is mostly triangles with textures all the way down.
Text rendering involves a vast number of trade-offs that have to be made in order to get your string to the screen (mainly: the number of characters in your alphabet, whether it is possible to rotate or scroll without losing quality, whether different font sizes can be generated, how much memory the text will take on the GPU, and so on).
So to avoid making the hard decisions right away, I will first show you how to parse a TTF file. Why would you want to do that? Depending on the rendering method you pick later on, it is highly likely that you will need information about the size and spacing of each character (unless you go with a monospace font, but let's keep the challenge real). Of course, the font file also contains everything that will ever be needed to render a given font. There is a table containing glyphs as a combination of several bezier curves, but I will not go into that since that would be a much deeper dive.
Prepare for some terse writing and bigger than usual blocks of code. I want to cover each line needed to go from having a TTF file on your hard drive to a JSON containing spacing information allowing us to render beautiful text.
TTF is a binary file format meant to be as lightweight as possible. This means that it comes as a bare sequence of numbers, and you know which value you are looking at only by knowing the order in which they appear.
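For example, the very beginning of a TrueType file is a four-byte version tag followed by a two-byte table count. A toy illustration of the idea (the bytes here are made up, apart from 0x00010000 being the usual TrueType version):
// Multi-byte values in TTF are big-endian, so we combine bytes by shifting.
const bytes = new Uint8Array([0x00, 0x01, 0x00, 0x00, 0x00, 0x0b])
const scalarType = (bytes[0] << 24) | (bytes[1] << 16) | (bytes[2] << 8) | bytes[3] // 0x00010000
const numTables = (bytes[4] << 8) | bytes[5] // 11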
I will follow with a step-by-step diary of how I managed to read a TTF file (specifically: font spacing information) in a bunch of NodeJS scripts.
Inspired by this article about parsing TTF files in JavaScript, I got the parser below to work. It can navigate around the Buffer with the content of the *.ttf file and read some meaningful values.
Classic fs module, promisified a bit.
import * as fs from 'fs'

const readFile = (fileName: string): Promise<Buffer> =>
  new Promise((resolve, reject) => {
    fs.readFile(fileName, (error, data) => {
      if (error) {
        reject(error)
        return
      }
      resolve(data)
    })
  })
Then I am just reading the file and providing it to my custom binary module.
const buffer = await readFile(input)
const reader = binaryFile(buffer)
Which looks like this:
const binaryFile = buffer => {
  const data = new Uint8Array(buffer)
  let position = 0

  const getUint8 = () => data[position++]
  const getUint16 = () => ((getUint8() << 8) | getUint8()) >>> 0
  const getUint32 = () => getInt32() >>> 0
  const getInt16 = () => {
    let number = getUint16()
    if (number & 0x8000) number -= 1 << 16
    return number
  }
  const getInt32 = () =>
    (getUint8() << 24) | (getUint8() << 16) | (getUint8() << 8) | getUint8()
  const getFWord = getInt16
  const getUFWord = getUint16
  const getOffset16 = getUint16
  const getOffset32 = getUint32
  const getF2Dot14 = () => getInt16() / (1 << 14)
  const getFixed = () => getInt32() / (1 << 16)
  const getString = length => {
    let string = ''
    for (let i = 0; i < length; i++) {
      string += String.fromCharCode(getUint8())
    }
    return string
  }
  const getDate = () => {
    const macTime = getUint32() * 0x100000000 + getUint32()
    // TTF dates are seconds since January 1, 1904 (months in Date.UTC are 0-indexed).
    const utcTime = macTime * 1000 + Date.UTC(1904, 0, 1)
    return new Date(utcTime)
  }
  const getPosition = () => position
  const setPosition = targetPosition => (position = targetPosition)

  return {
    getUint8,
    getUint16,
    getUint32,
    getInt16,
    getInt32,
    getFWord,
    getUFWord,
    getOffset16,
    getOffset32,
    getF2Dot14,
    getFixed,
    getString,
    getDate,
    getPosition,
    setPosition,
  }
}
TrueType font files are structured in tables. The file starts with a master table defining how many tables there are and where they start, and the rest of the file consists of those tables, each following a strictly specified format.
Microsoft's specification of the OpenType format was an invaluable help when I was figuring out what meant what.
As we can read in the section about file organization, the file starts with 5 numbers. The one we need is numTables, which tells us how many tables we can look for.
reader.getUint32() // scalarType
const numTables = reader.getUint16()
reader.getUint16() // searchRange
reader.getUint16() // entrySelector
reader.getUint16() // rangeShift
Note that I am 'reading' all numbers to put the cursor in the correct place. After the initial header, we can read information about each table in the file.
const tables = {}
for (let i = 0; i < numTables; i++) {
  const tag = reader.getString(4)
  tables[tag] = {
    checksum: reader.getUint32(),
    offset: reader.getUint32(),
    length: reader.getUint32(),
  }
}
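Each table lives at the offset recorded in this directory, so before parsing a table we need to move the cursor to its start. A tiny helper for that could look like this (the helper is my own sketch, assuming the reader and tables from above):
// Hypothetical helper: look a table up in the directory and jump to its start.
const seekToTable = tag => {
  const table = tables[tag]
  if (!table) throw new Error(`Missing required table: ${tag}`)
  reader.setPosition(table.offset)
}

// For example, before parsing the head table below: seekToTable('head')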
The first table that will tell us a lot about the rest of the file is called head. Reading it is pretty straightforward: we jump to its offset and follow the docs.
const head = {
  majorVersion: reader.getUint16(),
  minorVersion: reader.getUint16(),
  fontRevision: reader.getFixed(),
  checksumAdjustment: reader.getUint32(),
  magicNumber: reader.getUint32(),
  flags: reader.getUint16(),
  unitsPerEm: reader.getUint16(),
  created: reader.getDate(),
  modified: reader.getDate(),
  xMin: reader.getFWord(),
  yMin: reader.getFWord(),
  xMax: reader.getFWord(),
  yMax: reader.getFWord(),
  macStyle: reader.getUint16(),
  lowestRecPPEM: reader.getUint16(),
  fontDirectionHint: reader.getInt16(),
  indexToLocFormat: reader.getInt16(),
  glyphDataFormat: reader.getInt16(),
}
We will especially need unitsPerEm to convert FUnits (font units, the measuring system used in TTF) to pixels on the screen. Then indexToLocFormat will tell us which format to use when reading information about glyphs.
Going around the spec, we can find out several things about the tables:
- maxp will tell us how many glyphs there are in the file
- cmap tells us the mapping between character codes and glyph indices used throughout the font file
- glyf provides xMin, yMin, xMax, yMax
- loca knows the offsets of glyphs in the glyf table
- hmtx contains information about leftSideBearing (which is how far each character wants to be from the previous one) and advanceWidth, which is how much horizontal space it claims for itself
- hhea will tell us how many horizontal metrics are defined in the hmtx table (it doesn't have to be one for each character)

How each table depends on the others partially forces the order of parsing.
maxp
This one is another easy one to parse, as it is just a sequence of values.
const maxp = {
  version: reader.getFixed(),
  numGlyphs: reader.getUint16(),
  maxPoints: reader.getUint16(),
  maxContours: reader.getUint16(),
  maxCompositePoints: reader.getUint16(),
  maxCompositeContours: reader.getUint16(),
  maxZones: reader.getUint16(),
  maxTwilightPoints: reader.getUint16(),
  maxStorage: reader.getUint16(),
  maxFunctionDefs: reader.getUint16(),
  maxInstructionDefs: reader.getUint16(),
  maxStackElements: reader.getUint16(),
  maxSizeOfInstructions: reader.getUint16(),
  maxComponentElements: reader.getUint16(),
  maxComponentDepth: reader.getUint16(),
}
hhea
This one is necessary just for the numOfLongHorMetrics value.
const hhea = {
  version: reader.getFixed(),
  ascent: reader.getFWord(),
  descent: reader.getFWord(),
  lineGap: reader.getFWord(),
  advanceWidthMax: reader.getUFWord(),
  minLeftSideBearing: reader.getFWord(),
  minRightSideBearing: reader.getFWord(),
  xMaxExtent: reader.getFWord(),
  caretSlopeRise: reader.getInt16(),
  caretSlopeRun: reader.getInt16(),
  caretOffset: reader.getFWord(),
}
// Skip 4 reserved places.
reader.getInt16()
reader.getInt16()
reader.getInt16()
reader.getInt16()
hhea.metricDataFormat = reader.getInt16()
hhea.numOfLongHorMetrics = reader.getUint16()
hmtx
Thanks to hhea, we now know how many hMetrics there are.
const hMetrics = []
for (let i = 0; i < hhea.numOfLongHorMetrics; i++) {
  hMetrics.push({
    advanceWidth: reader.getUint16(),
    leftSideBearing: reader.getInt16(),
  })
}
const leftSideBearing = []
for (let i = 0; i < maxp.numGlyphs - hhea.numOfLongHorMetrics; i++) {
  leftSideBearing.push(reader.getFWord())
}
const hmtx = {
  hMetrics,
  leftSideBearing,
}
loca
Translates index to location, so it basically tells us how far to advance in the glyf table for a given glyph.
The interesting thing to note here is that loca comes in two flavours. It is either the short version, with Offset16 values storing the actual offset divided by 2, or the long version, with Offset32 values storing the real offset.
Both contain numGlyphs + 1 values: the extra entry at the end makes it possible to compute the length of the last glyph (and glyph index 0 is the special 'missing character' glyph).
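To make the short flavour concrete (the stored value here is made up):
// Hypothetical example: in the short format, a stored value of 150 means the
// glyph data starts 300 bytes into the glyf table.
const storedValue = 150
const byteOffset = head.indexToLocFormat === 0 ? storedValue * 2 : storedValue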
const getter =
  head.indexToLocFormat === 0 ? reader.getOffset16 : reader.getOffset32
const loca = []
for (let i = 0; i < maxp.numGlyphs + 1; i++) {
  loca.push(getter())
}
glyf
Now we can use the loca table along with indexToLocFormat to read the glyphs.
const glyf = []
for (let i = 0; i < loca.length - 1; i++) {
  // In the short format, loca stores offsets divided by 2.
  const multiplier = head.indexToLocFormat === 0 ? 2 : 1
  const locaOffset = loca[i] * multiplier
  // Offsets in loca are relative to the start of the glyf table.
  reader.setPosition(tables['glyf'].offset + locaOffset)
  glyf.push({
    numberOfContours: reader.getInt16(),
    xMin: reader.getInt16(),
    yMin: reader.getInt16(),
    xMax: reader.getInt16(),
    yMax: reader.getInt16(),
  })
}
Cmap is probably the hardest one to read here. It contains the mapping unicode char code -> glyph index, which means that finally we will be able to know which character is represented by what values.
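To make the goal concrete, this is the kind of lookup the finished map will allow (the resulting index is made up):
// glyphIndexMap maps a Unicode code point to an index into glyf/loca/hmtx.
const glyphIndexOfA = cmap.glyphIndexMap['A'.codePointAt(0)] // e.g. 36 in some fonts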
const cmap = {
  version: reader.getUint16(),
  numTables: reader.getUint16(),
  encodingRecords: [],
  glyphIndexMap: {},
}
We will stick to version 0.
if (cmap.version !== 0) {
  throw new Error(`cmap version should be 0 but is ${cmap.version}`)
}
The table starts with the definitions of the encodings.
for (let i = 0; i < cmap.numTables; i++) {
  cmap.encodingRecords.push({
    platformID: reader.getUint16(),
    encodingID: reader.getUint16(),
    offset: reader.getOffset32(),
  })
}
Now we can use that information to find out whether it is something we want to parse. There are so many formats that it doesn't make sense to support them all, especially since even libraries specializing in font parsing usually focus on one or two variants.
let selectedOffset = -1
for (let i = 0; i < cmap.numTables; i++) {
  const { platformID, encodingID, offset } = cmap.encodingRecords[i]
  const isWindowsPlatform =
    platformID === 3 &&
    (encodingID === 0 || encodingID === 1 || encodingID === 10)
  const isUnicodePlatform =
    platformID === 0 &&
    (encodingID === 0 ||
      encodingID === 1 ||
      encodingID === 2 ||
      encodingID === 3 ||
      encodingID === 4)
  if (isWindowsPlatform || isUnicodePlatform) {
    selectedOffset = offset
    break
  }
}
if (selectedOffset === -1) {
  throw new Error(
    "The font doesn't contain any recognized platform and encoding."
  )
}
So basically what we did here is pick a platform and encoding combination we recognize. Now we will check whether that subtable uses format 4 and abort if it doesn't.
// Subtable offsets are relative to the beginning of the cmap table.
reader.setPosition(tables['cmap'].offset + selectedOffset)

const format = reader.getUint16()
if (format === 4) {
  cmap.glyphIndexMap = parseFormat4(reader).glyphIndexMap
} else {
  throw new Error(`Unsupported format: ${format}. Required: 4.`)
}
Format 4
This one is the standard character-to-glyph-index mapping table for the Windows platform, for fonts that support Unicode BMP characters, as Microsoft's spec says.
Let's just start parsing and later on I will describe what does what. Starting with a function:
const parseFormat4 = reader => {
const format4 = {
  format: 4,
  length: reader.getUint16(),
  language: reader.getUint16(),
  segCountX2: reader.getUint16(),
  searchRange: reader.getUint16(),
  entrySelector: reader.getUint16(),
  rangeShift: reader.getUint16(),
  endCode: [],
  startCode: [],
  idDelta: [],
  idRangeOffset: [],
  glyphIndexMap: {}, // This one is my addition, contains final unicode->index mapping
}
For some reason, the segment count is stored doubled. That's why there's X2 appended to its name.
const segCount = format4.segCountX2 >> 1
for (let i = 0; i < segCount; i++) {
  format4.endCode.push(reader.getUint16())
}
reader.getUint16() // Reserved pad.
for (let i = 0; i < segCount; i++) {
  format4.startCode.push(reader.getUint16())
}
for (let i = 0; i < segCount; i++) {
  format4.idDelta.push(reader.getInt16())
}
const idRangeOffsetsStart = reader.getPosition()
for (let i = 0; i < segCount; i++) {
  format4.idRangeOffset.push(reader.getUint16())
}
Now that we've read all the information, the hard part comes.
So to introduce it a bit, the cmap table is based on segments. Segments are contiguous ranges of character codes, which allow the font to define only a subset of Unicode characters.
Each segment is described by startCode and endCode. It also has a corresponding idDelta and idRangeOffset that are used for mapping characters to glyph indices within the given segment.
The table was designed to make searching inside it efficient, so, for example, the last segment is a special one with 0xFFFF as both its start and end code. But we will not use that since we are just producing a JS object mapping.
for (let i = 0; i < segCount - 1; i++) {
  let glyphIndex = 0
  const endCode = format4.endCode[i]
  const startCode = format4.startCode[i]
  const idDelta = format4.idDelta[i]
  const idRangeOffset = format4.idRangeOffset[i]
So for each segment, we are looking for endCode - startCode + 1 mappings (the range is inclusive on both ends).
Follow this part of the spec, Format 4: Segment mapping to delta values, to get a detailed description of what is about to happen.
As the specification states, if idRangeOffset in the given segment is 0, our job becomes trivial and glyphIndex equals (c + idDelta) & 0xffff, where & 0xffff is simply modulo 65536, as the spec requires.
If idRangeOffset is not 0, the spec provides the following formula for calculating the glyph index:
glyphId = *(idRangeOffset[i]/2 + (c - startCode[i]) + &idRangeOffset[i])
It roughly translates to our JavaScript code, with a few exceptions. First, we need to multiply (c - startCode) by two, since all values here are two bytes long.
  // Character codes in a segment run from startCode to endCode inclusive.
  for (let c = startCode; c <= endCode; c++) {
    if (idRangeOffset !== 0) {
      const startCodeOffset = (c - startCode) * 2
      const currentRangeOffset = i * 2 // 2 because the numbers are 2 bytes big.
      let glyphIndexOffset =
        idRangeOffsetsStart + // where all offsets started
        currentRangeOffset + // offset for the current range
        idRangeOffset + // offset between the id range table and the glyphIdArray[]
        startCodeOffset // gets us finally to the character
      reader.setPosition(glyphIndexOffset)
      glyphIndex = reader.getUint16()
      if (glyphIndex !== 0) {
        // & 0xffff is modulo 65536.
        glyphIndex = (glyphIndex + idDelta) & 0xffff
      }
    } else {
      glyphIndex = (c + idDelta) & 0xffff
    }
    format4.glyphIndexMap[c] = glyphIndex
  }
}
And finally, we return format4 for usage in cmap.
return format4
}
Now that we have everything parsed, we can gather spacing information.
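The spacing function below expects a single ttf object carrying the parsed tables. The exact shape is my assumption, matching the fields the function reads:
// Hypothetical assembly of the parsed pieces into the object used below.
const ttf = {
  unitsPerEm: head.unitsPerEm,
  glyphIndexMap: cmap.glyphIndexMap,
  glyf,
  hmtx,
}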
The code for that is trivial, maybe with one exception: rightSideBearing is calculated by subtracting the leftSideBearing and the glyph width from advanceWidth, but it shouldn't be hard to grasp either.
const spacing = ttf => {
  const alphabet =
    ' !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~'
  const map = {}
  alphabet.split('').forEach(char => {
    const index = ttf.glyphIndexMap[char.codePointAt(0) || 0] || 0
    const glyf = ttf.glyf[index]
    const hmtx = ttf.hmtx.hMetrics[index]
    map[char] = {
      x: glyf.xMin,
      y: glyf.yMin,
      width: glyf.xMax - glyf.xMin,
      height: glyf.yMax - glyf.yMin,
      lsb: hmtx.leftSideBearing,
      rsb: hmtx.advanceWidth - hmtx.leftSideBearing - (glyf.xMax - glyf.xMin),
    }
  })
  return map
}
For example, let's have a look at the Q character.
"Q": {
"x": 168,
"y": -192,
"width": 1808,
"height": 2268,
"lsb": 168,
"rsb": 168
},
What you might find interesting is the range these values take. They are quite high compared to pixels. Simply speaking, the font is using a unit called FUnit (font unit), which encodes each glyph on a grid whose coordinates fit in two-byte numbers. To translate it to pixels, we must know how many units per em are used in the font we are looking at. You can read it from the head table.
The formula looks like this:
const scale = (1 / unitsPerEm) * fontSizeInPixels
You can then scale those values by multiplying them by the scale factor.
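For instance, assuming the font uses 2048 units per em (a common value; the real one comes from the head table) and we want a 16px font:
const unitsPerEm = 2048 // Assumed here; read it from head.unitsPerEm in practice.
const fontSizeInPixels = 16
const scale = (1 / unitsPerEm) * fontSizeInPixels // 0.0078125
const widthInPixels = 1808 * scale // The Q glyph above would be ~14px wide.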
In order to test the results of our work, we can render the bounding boxes of the characters on a canvas. Calculating the rectangles containing the glyphs looks as follows:
let positionX = 0
let rectangles = []
for (let i = 0; i < text.length; i++) {
  // Assuming spacing is an object containing information like presented above.
  const { x, y, width, height, lsb, rsb } = spacing[text[i]]
  rectangles.push({
    x: positionX + (x + (i !== 0 ? lsb : 0)) * scale,
    // 48 is the y coordinate of the baseline in pixels here.
    y: 48 - (y + height) * scale,
    width: width * scale,
    height: height * scale,
  })
  positionX += ((i !== 0 ? lsb : 0) + width + rsb) * scale
}
Note that I am skipping leftSideBearing when i is 0 (as I've noticed looking at text rendered by canvas). And positionX advances by the width of the written character as well as its rightSideBearing.
Getting it to the screen:
rectangles.forEach(r => context.fillRect(r.x, r.y, r.width, r.height))
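For completeness, a minimal setup those two snippets assume could look like this (the selector and the 48px font size are my own choices):
// Hypothetical setup: a 2D context and a scale factor for a 48px font
// (matching the 48 used as the baseline y above).
const canvas = document.querySelector('canvas')
const context = canvas.getContext('2d')
const fontSizeInPixels = 48
const scale = (1 / head.unitsPerEm) * fontSizeInPixels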
Note that we have only touched a small subset of the possible TTF tables. It is sufficient to render many fonts, for example Inter, but it might not be enough for others.
We didn't touch kerning or GPOS (the glyph positioning table), so spacing for some fonts might come out totally wrong. But you get the idea of how it works, and you should be able to add anything you need with the help of the Microsoft OTF spec.
Microsoft OpenType spec – great source of knowledge about data types used within the format.
Let's read a Truetype font file from scratch – great tutorial on reading TTF files that allowed me to get started.