mirror of https://github.com/nmasse-itix/yaus.git
commit
adb45dd84c
2 changed files with 143 additions and 0 deletions
@@ -0,0 +1,131 @@

# Design Principles
|
## Context

*URL shortening is a technique in which a URL may be made substantially shorter and still direct to the required page*,
[Wikipedia tells us](https://en.wikipedia.org/wiki/URL_shortening). URL shorteners are ubiquitous but nevertheless have
some drawbacks:

- They are run by big companies that can collect usage data and thus impair privacy.
- Public URL shortening services usually bind their users with an EULA.
- There is no guarantee that short URLs from a public service will be kept active.
  In fact, they can disappear at any time if the backing company goes bankrupt.

And last but not least, all URL shorteners, be they public services or
self-hosted ones, will complicate the task of the web archaeologists who will
try to reconstruct the history of the Web from the available evidence in ten,
fifty or a hundred years.

All those URL shorteners keep their mapping table (short URL -> long URL) private,
which means the user (the one accessing the short URL, not the one creating it)
is completely dependent on the URL shortening service.
|
## Project goals

This project strives to provide a URL shortening service that:

- is **public and transparent**: the mapping table (short URL -> long URL) is
  public and versioned in a Git repository. Everyone can fork it, keep it in
  a safe place or update it with their own URLs.
- is **self-hostable** and respects the user's **privacy**: instead of a couple
  of big instances of the **URL shortening service**, we would like to
  encourage smaller instances that would each drive significantly less traffic
  and offer less temptation to derive income from usage data.
|
## Technical Design

### A Git repository to store the mapping table

The mapping table is stored in a Git repository as a file or collection of files.
The file(s) contain the mapping table in a format that is easy for a human to
write, easy for a machine to parse, and "merge-friendly".

A user wanting to add a short URL to the mapping table could:

- Fork the repository containing the mapping table
- Add their mappings to the table
- Create a branch with their modifications
- Commit their changes
- Push them to their own fork
- Submit a Pull Request to ask for inclusion in the main mapping table

The Pull Request could be subject to approval, review, etc.
|
## Auto-generated vs custom code

Typical URL shorteners can generate a random short code or take a custom code
from the user.

In order to be as stateless as possible, the generated short code cannot be
random but needs to be generated deterministically. It can be a hash of the
input URL, for instance.
|
## Hashing algorithm

URL shorteners such as bit.ly use a combination of lowercase letters, uppercase
letters and digits to generate a random code. The code is seven characters
long.

This translates to about 41.7 bits of entropy:

```raw
$ echo 'l(62^7)/l(2)' | bc -l
41.67937417270812646207
```
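The same figure can be cross-checked without `bc`. The sketch below is only an illustration: the names `entropy_bits`, `ALPHABET_SIZE` and `CODE_LENGTH` are hypothetical and do not refer to any existing code in this project.

```python
import math

ALPHABET_SIZE = 26 + 26 + 10  # lowercase + uppercase + digits (hypothetical name)
CODE_LENGTH = 7               # length of a bit.ly-style code, as stated above


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Bits of entropy of a uniformly random code of the given length."""
    return length * math.log2(alphabet_size)


random_code_bits = entropy_bits(ALPHABET_SIZE, CODE_LENGTH)

# For comparison: six raw bytes taken from a hash carry 6 * 8 bits.
hash_prefix_bits = 6 * 8

print(f"random 7-character code: {random_code_bits:.4f} bits")
print(f"6-byte hash prefix:      {hash_prefix_bits} bits")
```

A six-byte prefix therefore offers slightly more room than a seven-character alphanumeric code.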
|
To implement a similar but fully deterministic mechanism, the SHA-256 algorithm
is used to hash the target URL, and the first six bytes (48 bits) of the digest
are encoded in base64, which yields an eight-character code.

The short code for `https://framasoft.org/` is computed as follows:

```raw
$ echo -n https://framasoft.org/ | openssl dgst -sha256 -binary | head -c6 | openssl base64 -e
t0P0JMya
```
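The same pipeline could be written in a few lines of Python. This is a minimal sketch using only the standard library; the function name `short_code` and its `num_bytes` parameter are assumptions made for illustration, since the project has no code yet.

```python
import base64
import hashlib


def short_code(url: str, num_bytes: int = 6) -> str:
    """Derive a deterministic short code from a URL (hypothetical helper).

    Mirrors the shell pipeline: SHA-256 of the URL bytes, keep the first
    `num_bytes` bytes of the digest, encode them with standard base64.
    """
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return base64.b64encode(digest[:num_bytes]).decode("ascii")


code = short_code("https://framasoft.org/")
print(code)
```

Because six bytes are a multiple of three, the base64 output is always eight characters long and never needs `=` padding. One thing to keep in mind: the standard base64 alphabet contains `+` and `/`, which have a special meaning in URLs. Switching to `base64.urlsafe_b64encode` would avoid them, but that is a possible variation and not something this design states.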
|
## File Format of the mapping table

The mapping table uses YAML as its file format because it fits all the requirements.

A sample mapping table looks like:

```yaml
---
base_url: https://short.code/
mapping:
- url: https://framasoft.org/
- url: https://www.gnu.org/
  short-code: gnu-home
```

This sample table defines two entries:

- `https://short.code/t0P0JMya`, which maps to `https://framasoft.org/`
- `https://short.code/gnu-home`, which maps to `https://www.gnu.org/`
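To make the resolution rule concrete, here is a minimal sketch of how such a table could be expanded into a lookup of short URL to long URL. It assumes the YAML file has already been parsed into plain Python dictionaries and lists (the parser itself is left out), and the names `build_table` and `short_code` are hypothetical. Rejecting duplicate short codes is also an assumption; the design above does not say how conflicts are handled.

```python
import base64
import hashlib


def short_code(url: str) -> str:
    """Auto-generated code: first six bytes of SHA-256(url), base64-encoded."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return base64.b64encode(digest[:6]).decode("ascii")


def build_table(document: dict) -> dict:
    """Expand a parsed mapping table into {short URL: long URL}."""
    base_url = document["base_url"]
    table = {}
    for entry in document["mapping"]:
        # A custom code wins; otherwise the code is derived from the URL.
        code = entry.get("short-code") or short_code(entry["url"])
        short_url = base_url + code
        if short_url in table and table[short_url] != entry["url"]:
            # Assumption: two different targets for one code is an error.
            raise ValueError(f"duplicate short code: {code}")
        table[short_url] = entry["url"]
    return table


# The sample table from above, as it would look once parsed.
sample = {
    "base_url": "https://short.code/",
    "mapping": [
        {"url": "https://framasoft.org/"},
        {"url": "https://www.gnu.org/", "short-code": "gnu-home"},
    ],
}

table = build_table(sample)
for short_url, long_url in table.items():
    print(short_url, "->", long_url)
```

A check like the duplicate detection could also run when a Pull Request is reviewed, so that conflicting entries are caught before they reach the main mapping table.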
|
## Deployment

The app is packaged and deployed as a container. In this case, we need
to take into account that the filesystem might be read-only. Updates of the
mapping table come with a new deployment of an updated container image.

In order to achieve rolling updates without service interruption, a health probe
needs to be implemented.
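One way to picture the health probe is as a request-routing function that reports "healthy" only once the mapping table has been loaded, so that an orchestrator can wait before sending traffic to a new instance. The sketch below is an assumption-laden illustration: the `/healthz` path, the use of a 302 redirect, the `respond` function and a table keyed by short code are all hypothetical choices, not decisions made by this design.

```python
from http import HTTPStatus

HEALTH_PATH = "/healthz"  # hypothetical probe path


def respond(path, table):
    """Return (status, redirect_target) for a request path.

    `table` maps short codes to long URLs, or is None while the mapping
    table has not been loaded yet.
    """
    if path == HEALTH_PATH:
        # Healthy only when the mapping table is available.
        if table is None:
            return HTTPStatus.SERVICE_UNAVAILABLE, None
        return HTTPStatus.OK, None
    if table is None:
        return HTTPStatus.SERVICE_UNAVAILABLE, None
    # Drop exactly one leading slash; codes may themselves contain "/".
    code = path[1:] if path.startswith("/") else path
    target = table.get(code)
    if target is None:
        return HTTPStatus.NOT_FOUND, None
    return HTTPStatus.FOUND, target


demo_table = {"gnu-home": "https://www.gnu.org/"}
print(respond("/healthz", None))
print(respond("/healthz", demo_table))
print(respond("/gnu-home", demo_table))
```

Keeping the routing logic in a pure function like this makes it easy to test without starting an HTTP server; wiring it into a real server is left out on purpose.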
|
When the app is deployed outside of a container, an update of the mapping table
can be triggered by a `git pull` (from a crontab for instance). The app needs
to monitor the file containing the mapping table and hot-reload it once
modifications are detected.
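A simple way to detect modifications is to poll the file's modification time and reload only when it changes. The following sketch assumes polling is acceptable (a file-system notification mechanism would be another option); the class name `ReloadingFile` and the `loader` callback are hypothetical, and `str.strip` merely stands in for a real YAML parser in the demo.

```python
import os
import tempfile


class ReloadingFile:
    """Keep the parsed content of a file up to date (hypothetical helper)."""

    def __init__(self, path, loader):
        self.path = path
        self.loader = loader    # turns the file's text into a parsed value
        self.mtime_ns = None
        self.value = None
        self.refresh()

    def refresh(self) -> bool:
        """Reload the file if its mtime changed; return True when reloaded."""
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns == self.mtime_ns:
            return False
        with open(self.path, encoding="utf-8") as handle:
            # If the loader raises, the previous value is kept.
            self.value = self.loader(handle.read())
        self.mtime_ns = mtime_ns
        return True


# Demo with a temporary file standing in for the mapping table.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "mapping.yaml")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("version 1\n")
    watched = ReloadingFile(path, loader=str.strip)
    first = watched.value
    unchanged = watched.refresh()  # nothing changed yet

    with open(path, "w", encoding="utf-8") as handle:
        handle.write("version 2\n")
    # Bump the mtime explicitly so the demo does not depend on the
    # timestamp resolution of the file system.
    stat = os.stat(path)
    os.utime(path, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))
    changed = watched.refresh()
    second = watched.value

print(first, unchanged, changed, second)
```

In a running app, `refresh()` would be called periodically, for instance from a timer or before serving a request.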
|
## Coding principles

This app follows the [12 factors](https://12factor.net/).

## Minimum Viable Product

The MVP of this project has the following features:

- reads only one mapping table
- serves the requests for only one domain
- supports auto-generated and custom codes
- is packaged as a container

@@ -0,0 +1,12 @@

# YAUS: Yet Another URL Shortener

YAUS is a URL shortener, akin to bit.ly and others, but with radical design
principles.

For now, there is no code, but you can have a look at our
[design principles](DESIGN.md).

## Authors

- Mathieu Demange ([@sigmate](/sigmate))
- Nicolas Massé ([@nmasse-itix](/nmasse-itix))