ඞ

represent any character using only Python builtin function calls, no literals

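Python has built-in functions like len(), str(), chr(), etc., plus not(), which reads like a call even though it is really the not operator applied to the empty tuple (). This site builds any Unicode character using only those: no numbers, no strings, no variables, and only one parameter per call.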
Just nested function calls.

For example, chr(max(range(ord(min(str(bytes())))))) evaluates to &.
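To see why that works, the expression can be unwound from the inside out. This is only an illustrative sketch: the variable names are there to label each intermediate value and are not part of the golfed expression.

```python
# Evaluate chr(max(range(ord(min(str(bytes())))))) one layer at a time.
empty = bytes()               # b'' (an empty bytes object)
text = str(empty)             # the three characters: b ' '
smallest = min(text)          # the apostrophe has the lowest code point
code = ord(smallest)          # 39
one_less = max(range(code))   # 38
result = chr(one_less)        # "&"

print(result)
```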

Try pasting the expression into a python console (or using print() on it in a python file).
After you enter a character, the visualize button walks you through the evaluation step by step.

Also (new!): this works for strings too, but there we allow multiple parameters.

Right around the whole Among Us era, word got around that chr(sum(range(ord(min(str(not())))))) in Python evaluates to ඞ, a Unicode character that looks suspiciously like the Among Us crewmate.
This was an amazing discovery. But (unfortunately), I immediately tried to generalize it.
Could any Unicode character (there are ~160,000) be represented like this?

The rules, to be clear (I made them up):

The final (innermost) function must take no parameters, like not().
Each function can only take in one parameter; pow(a,b) isn't allowed.
Represent each Unicode character as a composition of these functions.

Since we only aim to find the Unicode value of the character and then apply chr() to it, the struggle is essentially to find a neat representation for each number 1–160,000.

And the representation MUST be neat, in a sense, since Python won't let you have more than 200 nested parentheses. It seemed like a cool challenge idk

My initial attempt used three tools:

A few initial seeds, like len(bin(len(str(not())))) = 5
sum(range(n)) = n(n-1)/2 — to jump up quickly
max(range(n)) = n-1 — to step back down

The idea: apply sum(range()) a few times to overshoot your target, then decrement back down with max(range()).

For example, to get to 6:

5 = len(bin(len(str(not()))))
10 = sum(range(5)) = sum(range(len(bin(len(str(not()))))))
6 = max(range(max(range(max(range(max(range(10))))))))

This algorithm works, but for bigger numbers it eventually can't fit within Python's 200-parentheses limit. In this example, to reach 100, you'd have to use sum(range()) thrice on 5 to get 990, and then do 890 decrements! Even if you have many initial seeds, the quadratic growth is much too sparse to reach arbitrarily large numbers within the parentheses budget.
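The cost of this overshoot-then-decrement approach is easy to tally. The helper below is a hypothetical sketch (overshoot_cost is my name, not something from the project) that counts the jumps and decrements needed from a seed, assuming the seed is at least 4 so that sum(range()) actually grows.

```python
def overshoot_cost(target, seed=5):
    """Count sum(range()) jumps and max(range()) decrements from `seed`."""
    if seed < 4:
        # sum(range(3)) == 3, so smaller seeds never grow.
        raise ValueError("seed must be at least 4")
    value, jumps = seed, 0
    while value < target:
        value = sum(range(value))   # n -> n(n-1)/2
        jumps += 1
    decrements = value - target     # one max(range()) per step down
    return value, jumps, decrements

value, jumps, decrements = overshoot_cost(100)
# Each wrapper adds two nested calls (the seed's own depth is not counted).
nested = 2 * jumps + 2 * decrements
print(value, jumps, decrements, nested)
```

For a target of 100 this reports 3 jumps to 990 followed by 890 decrements, which is far beyond a 200-level budget.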

I went looking for formulas to get from a number n to f(n), and found many, but credit to Gemini for finding the key: a direct function that turns n into 3n. This is incredibly useful, because it means we can represent each number n in O(log n) parentheses. Basically, it's kind of like the algorithm to write a number in base 3, except in the opposite direction, since we can only subtract and not add.

Two operations are enough:

Subtract 1: max(range(n)) returns n - 1.

range(n) produces 0, 1, ..., n-1. max() picks the last one. Costs 2 parentheses.

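Multiply by 3: len(str(list(bytes(n)))) returns exactly 3n for n ≥ 1.

bytes(n) creates n zero-bytes. list() turns it into [0, 0, ..., 0]. str() gives "[0, 0, ..., 0]", which is n digits, n - 1 two-character separators, and 2 brackets: exactly 3n characters. (For n = 0 the string is "[]", so the result is 2, not 0.) Costs 4 parentheses.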
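Both operations can be checked directly. A small sketch, with minus_one and times_three as illustrative names only:

```python
def minus_one(n):
    # range(n) ends at n - 1, so max() returns n - 1 (needs n >= 1).
    return max(range(n))

def times_three(n):
    # "[0, 0, ..., 0]" has n digits, 2(n - 1) separator chars, 2 brackets.
    return len(str(list(bytes(n))))

checked = range(1, 200)
assert all(minus_one(n) == n - 1 for n in checked)
assert all(times_three(n) == 3 * n for n in checked)

# Edge case: an empty list prints as "[]", so n = 0 gives 2 rather than 0.
print(times_three(0))
```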

The algorithm: to represent n, decompose it in base 3. At each step, build ceil(n/3), triple it, then subtract the remainder (0, 1, or 2). Stop when you reach a base anchor: a small number you can construct directly (like len(str(not())) = 4). Note: the only base anchors you actually need are 1 = int(not()) and 0 = int(not(not())).

function build(n):
    if n is a base anchor:
        return the anchor expression

    q = ceil(n / 3)
    r = 3 * q - n              // r is 0, 1, or 2

    expr = triple(build(q))    // multiply by 3
    expr = subtract(expr, r)   // subtract 0, 1, or 2
    return expr

Example: build(13)

You can think about it this way: 13 = 15-2 = 5*3-2 = (3*2-1)*3-2

build(13):
    q = ceil(13/3) = 5,  r = 15 - 13 = 2
    13 = triple(build(5)) - 2

  build(5):
      q = ceil(5/3) = 2,  r = 6 - 5 = 1
      5 = triple(build(2)) - 1

    build(2):
        2 is a base anchor
        return len(str(ord(min(str(not())))))

  working back up:
    build(5) = triple(len(str(ord(min(str(not())))))) - 1
    build(13) = triple(that) - 2

Then wrap the whole thing in chr() to get the character.
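As a rough Python rendering of the pseudocode above: this is a minimal sketch, not the project's actual code. It uses only the three anchors mentioned in the text (the real anchor list is larger), and char_expr is a name I made up for the final chr() wrapper.

```python
# Anchors quoted in the text; the real project reportedly has more.
ANCHORS = {
    0: "int(not(not()))",
    1: "int(not())",
    2: "len(str(ord(min(str(not())))))",
}

def build(n):
    """Return an expression string that evaluates to n."""
    if n in ANCHORS:
        return ANCHORS[n]
    q = -(-n // 3)          # ceil(n / 3) using integer arithmetic
    r = 3 * q - n           # 0, 1, or 2
    expr = f"len(str(list(bytes({build(q)}))))"   # multiply by 3
    for _ in range(r):
        expr = f"max(range({expr}))"              # subtract 1
    return expr

def char_expr(ch):
    """Wrap the code point's expression in chr()."""
    return f"chr({build(ord(ch))})"

print(build(13))
print(eval(char_expr("ඞ")))
```

Evaluating the result with eval() is only a convenient way to check the output; pasting the printed expression into a console works the same way.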

Algorithm stats (base-3 only, no optimizations)

avg depth: average number of function calls per expression
max depth: most function calls needed for any single number
avg length: average character count of the expression string
max length: longest expression string for any single number

Python has a 200 nested parentheses limit. The base-3 algorithm stays well under that for all Unicode code points (max 1,114,111).
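That claim can be sanity-checked without building any expression strings. The sketch below counts nested calls for the plain base-3 scheme, assuming only the two minimal anchors (0 and 1) and counting not() as a call the way the rest of this page does.

```python
MAX_CODE_POINT = 0x10FFFF   # 1,114,111

# Nesting depth of the two minimal anchors:
# int(not(not())) is 3 calls deep, int(not()) is 2.
ANCHOR_DEPTH = {0: 3, 1: 2}

def base3_depths(limit):
    """depth[n] = nested calls in the base-3 expression for n."""
    depth = [0] * (limit + 1)
    for n in range(limit + 1):
        if n in ANCHOR_DEPTH:
            depth[n] = ANCHOR_DEPTH[n]
            continue
        q = -(-n // 3)                       # ceil(n / 3), always < n here
        r = 3 * q - n                        # number of decrements
        depth[n] = depth[q] + 4 + 2 * r      # triple = 4 calls, each -1 = 2
    return depth

depths = base3_depths(MAX_CODE_POINT)
worst = max(depths) + 1                      # +1 for the outer chr()
print(worst)
```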

The base-3 algorithm works for everything but isn't always the shortest.

The idea: for each number in the database, the optimizer asks "could any strategy + some smaller number produce this more cheaply?" For example, for target 51: the 3x strategy inverts to 51/3 = 17. If wrapping 17's expression in len(str(list(bytes(...)))) is shorter than what we already have for 51, we replace it.

It does this for every number (0–200,000) and every strategy. Each pass can improve expressions, and improvements cascade — if a later pass finds a shorter way to build 17, everything that depends on 17 (like 51) automatically gets shorter too.
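The cascading behaviour is essentially a shortest-path relaxation. Here is a minimal sketch under my own assumptions: it pushes costs forward from each number instead of inverting strategies, it measures cost in nested calls rather than string length, and it uses only a handful of the strategies described below.

```python
# (name, function, nested calls added). None means "not applicable".
STRATEGIES = [
    ("3n",   lambda n: 3 * n if n >= 1 else None, 4),
    ("4n+3", lambda n: 4 * n + 3,                 3),
    ("5n+5", lambda n: 5 * n + 5,                 4),
    ("tri",  lambda n: n * (n - 1) // 2,          2),
    ("dec",  lambda n: n - 1 if n >= 1 else None, 2),
]

def optimise(limit, seeds):
    """Relax costs until no strategy improves any number in 0..limit."""
    inf = float("inf")
    cost = [inf] * (limit + 1)
    for n, c in seeds.items():
        cost[n] = c
    changed = True
    while changed:                       # one loop iteration = one pass
        changed = False
        for n in range(limit, -1, -1):
            if cost[n] == inf:
                continue
            for _name, f, parens in STRATEGIES:
                target = f(n)
                if target is None or target > limit:
                    continue
                if cost[n] + parens < cost[target]:
                    cost[target] = cost[n] + parens
                    changed = True       # an improvement may cascade
    return cost

# Seeds: int(not(not())) is 3 calls deep, int(not()) is 2.
cost = optimise(200, {0: 3, 1: 2})
print(cost[7], cost[51])
```

Passes repeat until nothing changes, so an improvement to a small number is picked up by everything built on top of it.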

Available strategies

Exact multipliers

All based on stringifying bytes objects in different ways:

len(str(list(bytes(n))))	= 3n	4 parens
len(str(bytes(n)))	= 4n + 3	3 parens
len(ascii(str(bytes(n))))	= 5n + 5	4 parens

4n+3 costs only 3 parens (cheaper than 3x), so it's often better when you can land on the right value with the +3 offset.
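These three formulas are easy to confirm empirically. A short check, with function names chosen only for illustration:

```python
def triple(n):
    return len(str(list(bytes(n))))      # 3n, valid for n >= 1

def quad_plus_3(n):
    return len(str(bytes(n)))            # 4n + 3: b'...' plus 4 chars per \x00

def quint_plus_5(n):
    return len(ascii(str(bytes(n))))     # 5n + 5: backslashes doubled, 2 quotes

for n in range(1, 100):
    assert triple(n) == 3 * n
    assert quad_plus_3(n) == 4 * n + 3
    assert quint_plus_5(n) == 5 * n + 5

# 51 can be reached as 3 * 17 (4 calls) or as 4 * 12 + 3 (3 calls).
print(triple(17), quad_plus_3(12))
```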

Zip chain — higher exact multiples via nested tuples

Each zip() wrapper turns each element into a deeper tuple, adding exactly 3n to the string length.

len(str(list(zip(bytes(n)))))	= 6n	5 parens
len(str(list(zip(zip(bytes(n))))))	= 9n	6 parens
...k zips...	= 3(k+1)n	4+k parens
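The 3(k+1)n pattern can be verified with a loop over k. In this sketch zip_chain_len is an illustrative helper, and like plain tripling it only holds for n ≥ 1 (an empty list still prints as "[]").

```python
def zip_chain_len(n, k):
    """len(str(list(zip(...zip(bytes(n))...)))) with k nested zip() calls."""
    obj = bytes(n)
    for _ in range(k):
        obj = zip(obj)       # each layer wraps every element in a 1-tuple
    return len(str(list(obj)))

for k in range(0, 6):
    for n in range(1, 40):
        assert zip_chain_len(n, k) == 3 * (k + 1) * n

print(zip_chain_len(5, 1), zip_chain_len(5, 2))
```

Each extra zip() turns an element like (0,) into ((0,),), which adds three characters per element.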

Ascii exponential — exponential multiplier for linear paren cost

str(bytes(n)) produces backslash escapes like \x00. Each ascii() call escapes those backslashes again, doubling them. So the string roughly doubles in length with each wrap:

str(bytes(2))         = "b'\x00\x00'"          len = 11
ascii(that)           = "\"b'\\x00\\x00'\""     len = 15
ascii(ascii(that))    = ...                      len = 23

General formula with k layers of ascii(): f(n) = (2^k + 3)n + (2^(k+1) + 1)

k=1 (one ascii)	5n + 5	4 parens
k=2 (two ascii)	7n + 9	5 parens
k=3	11n + 17	6 parens
k=4	19n + 33	7 parens
k=5	35n + 65	8 parens
k=6	67n + 129	9 parens
k=10	1027n + 2049	13 parens
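The general formula can be checked against CPython's actual ascii() output. A small sketch, with ascii_chain_len as an illustrative name:

```python
def ascii_chain_len(n, k):
    """Length of str(bytes(n)) after k layers of ascii()."""
    s = str(bytes(n))
    for _ in range(k):
        s = ascii(s)         # re-escapes backslashes (and quotes) each time
    return len(s)

def predicted(n, k):
    return (2**k + 3) * n + (2 ** (k + 1) + 1)

for k in range(1, 11):
    for n in range(0, 20):
        assert ascii_chain_len(n, k) == predicted(n, k)

print(ascii_chain_len(2, 0), ascii_chain_len(2, 1), ascii_chain_len(2, 2))
```

The constant term grows too, because from the second layer on ascii() also has to escape the quote characters added by earlier layers.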

Triangular jump — quadratic growth for 2 parens

sum(range(n)) = n(n-1)/2

Full list of strategies (many are useless): strategies.py · Base anchors: anchors.py
If you can think of any I'm missing: submit them on GitHub!

A simple SQLite database stores the shortest known expression for each number from 0 to 200,000. Each entry records which strategy produced it and which smaller number it depends on. So, for example, if some strategy (call it formula_5) gives formula_5(20) = 50 and we later find a shorter expression for 20, we automatically plug that in and get a shorter representation for 50.
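One way such a dependency table could look is sketched below. The schema, the column names, and the render helper are all hypothetical (the real database layout isn't shown here); the point is only that storing a strategy plus a dependency lets a change to one row flow into everything built on it.

```python
import sqlite3

# Hypothetical schema: the project's real layout is not shown in the text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries ("
    " n INTEGER PRIMARY KEY,"
    " template TEXT NOT NULL,"     # strategy, with {} where the dependency goes
    " depends_on INTEGER)"         # NULL for base anchors
)
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?)",
    [
        (0, "int(not(not()))", None),
        (1, "int(not())", None),
        (3, "len(str(bytes({})))", 0),     # 4n + 3 with n = 0
        (15, "len(str(bytes({})))", 3),    # 4n + 3 with n = 3
    ],
)

def render(n):
    """Expand an entry by recursively plugging in its dependency."""
    template, dep = conn.execute(
        "SELECT template, depends_on FROM entries WHERE n = ?", (n,)
    ).fetchone()
    return template if dep is None else template.format(render(dep))

before = render(15)
# Swap in a different expression for 3 (3n with n = 1).
conn.execute(
    "UPDATE entries SET template = ?, depends_on = ? WHERE n = 3",
    ("len(str(list(bytes({}))))", 1),
)
after = render(15)   # 15 picks up the new expression for 3 automatically
print(before)
print(after)
```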

Current stats

entries: total numbers stored in the database
avg depth: average number of function calls per expression
max depth: most function calls needed for any single number
avg length: average character count of the expression string

Strategy breakdown

How many numbers use each strategy as their shortest representation, in the final database.

Optimization history

Each row shows the state of the database after a round of improvements. Minimal = only seeds 0 and 1. Full algorithm = adds 44 base anchors. Optimizer = tries every strategy on every number and keeps the shortest. Deep search = same, but allows up to 10 extra decrements to bridge gaps (instead of 2).

Columns: avg depth (average function calls across all numbers), max depth (worst-case function calls for any number), avg len (average expression string length).

So we've got single characters figured out. But what about whole strings? Python has no builtin concat(a, b). There's no way to join two strings without operators (+), methods (.join()), or syntax ([]) (at least, I really don't think so!).

The single-character system works nicely because it produces a linear AST (abstract syntax tree). Every expression is a straight line: each function wraps the previous one, flowing in one direction without any branches or merges. So if we have to bend the rules, methods that take no arguments are the lesser evil, since they at least keep the AST linear: you can think of x.to_bytes() as basically just to_bytes(x), it just looks less nice. Multiple parameters, on the other hand, introduce branches in the AST :(

The giant integer problem

To keep the AST completely linear for a whole string, that would mean uniquely representing each string as a single number. There is a clear way to do this - write the string as a number in base 256, where each byte is a digit (basically, just write the string out as raw bytes and interpret it as an integer). For example, "hello" as a big-endian integer is about 448 billion.
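The mapping itself is a one-liner with int.from_bytes. A minimal sketch (the function names are mine), assuming UTF-8 and noting that the round trip as written would lose a leading NUL byte:

```python
def string_to_int(text):
    """Read the string's UTF-8 bytes as one big-endian base-256 number."""
    return int.from_bytes(text.encode("utf-8"), "big")

def int_to_string(value):
    """Inverse mapping (loses leading NUL bytes, which have no weight)."""
    length = (value.bit_length() + 7) // 8
    return value.to_bytes(length, "big").decode("utf-8")

n = string_to_int("hello")
print(n)
print(int_to_string(n))
```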

But representing huge numbers with our restricted toolset doesn't scale. CPython physically cannot handle the execution:

Memory wall: The base-3 algorithm uses len(str(list(bytes(n)))) for tripling. Python physically executes bytes(n). For billion-scale numbers, that means allocating gigabytes of RAM at runtime, resulting in an instant MemoryError.
AST wall: Python's parser rejects expressions with more than 200 nested parentheses. Each number costs O(log n) parentheses, and n grows exponentially with the length of the string, so the parenthesis count grows linearly with the number of characters. Past ~8 characters, it hits the 200 limit and throws a SyntaxError.
C-struct wall: len() is capped at sys.maxsize (2⁶³ - 1 on 64-bit builds), and bytes(n) refuses sizes beyond it. A 10-character string easily exceeds this, breaking any length-based math tricks.
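The sys.maxsize cap is easy to observe with len(). The sketch below uses "helloworld" as an arbitrary 10-character example and does not attempt the memory or nesting walls, which would need far more resources to trigger safely.

```python
import sys

def string_to_int(text):
    return int.from_bytes(text.encode("utf-8"), "big")

n = string_to_int("helloworld")          # 10 ASCII characters = 80 bits
too_big = n > sys.maxsize

try:
    len(range(n))                        # len() must fit in a C ssize_t
    overflowed = False
except OverflowError:
    overflowed = True

print(n.bit_length(), too_big, overflowed)
```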

The linear AST approach dies at ~8 characters. The numbers just get too big :(
I really doubt there's a different way to map between strings and integers in a one-to-one way.
There are other issues with this too. For example, you can't write x = b"ඞ" in Python; you get:
SyntaxError: bytes can only contain ASCII literal characters
So you'd have to build the bytes one by one, which would require multiple parameters!
Also, to even decode the bytes back into a string, you'd need to specify that you are doing str(x, "u8"), which uses multiple parameters...
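Both obstacles can be reproduced in a few lines. This sketch uses compile() so the bad literal can be caught as an exception instead of stopping the script:

```python
# A non-ASCII character inside a bytes literal is rejected at parse time.
try:
    compile('x = b"ඞ"', "<demo>", "exec")
    rejected = False
except SyntaxError:
    rejected = True

# Going through str works, but decoding needs a second argument.
raw = "ඞ".encode("utf-8")      # three bytes: e0 b6 9e
decoded = str(raw, "u8")       # "u8" is an alias for UTF-8

print(rejected, list(raw), decoded)
```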

Forced to branch

Since massive numbers are impossible, we have to build each byte independently and pack them together. But merging independent branches requires either commas (multi-arg functions) or dot notation (methods). (And dunder methods like .__add__() are outright banned, as they are just operators in disguise).

Here are a few early attempts at merging that didn't make the cut:

str().join([chr(a), chr(b)]) - Relies on [] list syntax. Rejected.
str().join(Exception(chr(a), chr(b)).args) - Avoids [], but introduces both commas AND attribute access. Rejected.

The map(ord) pipeline

This led to horizontal packing using zip():

bytes(map(ord, next(zip(
  chr(b1), chr(b2), ...
)))).decode()

zip() on single-char strings produces a tuple. next() extracts it. map(ord, ...) converts back to integers, bytes() builds the byte string, and .decode() gives the final string.
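Unrolled for the two bytes of "py", the pipeline looks like this. Plain integers stand in for the golfed sub-expressions here, purely to keep the sketch readable:

```python
b1, b2 = 112, 121                      # code points of "p" and "y"

columns = zip(chr(b1), chr(b2))        # yields a single tuple: ("p", "y")
row = next(columns)                    # pull that tuple out of the iterator
packed = bytes(map(ord, row))          # back to integers, then b"py"
text = packed.decode()                 # UTF-8 by default

print(text)
```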

This works. But passing ord as a bare reference into map() is basically cheating—it's an uncalled function identifier, not the evaluated result of a call. Plus, it still relies on a method (.decode()).

The final solution: the eval() literal pipeline

Instead of building a string and decoding it, we can build the Python source code for the string literal as raw bytes, and let eval() parse it. To get pure integers into zip() without using map(), we use reversed(range(N))—an iterator whose first element is exactly N-1.

eval(bytes(next(zip(
  reversed(range(40)),  # yields 39 (')
  reversed(range(113)), # yields 112 (p)
  reversed(range(122)), # yields 121 (y)
  reversed(range(40))   # yields 39 (')
))))

Step by step for "py":

repr("py") gives 'py'. Its UTF-8 bytes are 39, 112, 121, 39.
For each byte b, generate b+1 with the base-3 algorithm, and wrap it in reversed(range(...)). This yields b as its first element.
zip packs the iterators. next pulls the first element from each, creating the tuple: (39, 112, 121, 39).
bytes naturally consumes the tuple to build b"'py'".
eval accepts bytes natively in Python 3, parses the literal, and returns "py".
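Put together, the steps above can be sketched as a small generator. This is my own minimal version, not the site's code: build() is the plain base-3 scheme with three anchors from earlier on this page, and string_expr is a hypothetical name.

```python
def build(n):
    """Base-3 expression for n, using three anchors quoted in the text."""
    anchors = {
        0: "int(not(not()))",
        1: "int(not())",
        2: "len(str(ord(min(str(not())))))",
    }
    if n in anchors:
        return anchors[n]
    q = -(-n // 3)                                  # ceil(n / 3)
    expr = f"len(str(list(bytes({build(q)}))))"     # multiply by 3
    for _ in range(3 * q - n):
        expr = f"max(range({expr}))"                # subtract 1
    return expr

def string_expr(text):
    """Expression that evaluates to `text` via the eval(bytes(...)) trick."""
    literal = repr(text).encode("utf-8")            # source code of the literal
    # reversed(range(b + 1)) yields b first, so build b + 1 for each byte.
    parts = [f"reversed(range({build(b + 1)}))" for b in literal]
    return f"eval(bytes(next(zip({', '.join(parts)}))))"

expr = string_expr("py")
print(len(expr))
print(eval(expr))
```

Non-ASCII text goes through the same path, since eval() reads the bytes as UTF-8 source.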

So yeah. We end up using multiple parameters for zip. But no methods. And it still comes out looking pretty cool I think.
