That is how basic Lempel-Ziv compression works, which is one of the two fundamental techniques used in data compression. You keep a history of what you've already decompressed, and if you find something you already have, you instead encode a pointer to it. Variants of this (LZW, LZ77, LZSS, and a bajillion custom variants, because game developers like to reinvent their own compression) have been used in many, many games.
The other is entropy coding, which means re-writing bytes and pointer codes into variable length values that are optimized by frequency: fewer bits for common codes, more bits for uncommon ones. The most basic approach for this is Huffman coding, which is kind of like variable length binary phone numbers.
Together, these two techniques are how DEFLATE compression works, which is the algorithm used by zip, gzip, and zlib. zlib is then used in many, many formats, like PNG and PDF.
Trivial "look for dupes and subsets and just change pointers" compression is already standard when building C code. If your program has two strings, "foobar" and "bar", the linker will just store "foobar" in the strings section and point 3 bytes into it for "bar", assuming the right compilation/linking options to make it work.
The other is entropy coding, which means re-writing bytes and pointer codes into variable length values that are optimized by frequency: fewer bits for common codes, more bits for uncommon ones. The most basic approach for this is Huffman coding, which is kind of like variable length binary phone numbers.
Together, these two techniques are how DEFLATE compression works, which is the algorithm used by zip, gzip, and zlib. zlib is then used in many, many formats, like PNG and PDF.
Trivial "look for dupes and subsets and just change pointers" compression is already standard when building C code. If your program has two strings, "foobar" and "bar", the linker will just store "foobar" in the strings section and point 3 bytes into it for "bar", assuming the right compilation/linking options to make it work.