commit adb45dd84c48f8f556be41c7b8301be5575d684b Author: Nicolas MASSE Date: Wed Mar 6 21:33:39 2019 +0100 initial commit diff --git a/DESIGN.md b/DESIGN.md new file mode 100644 index 0000000..fa3f3ce --- /dev/null +++ b/DESIGN.md @@ -0,0 +1,131 @@ +# Design Principles + +## Context + +*URL shortening is a technique in which a URL may be made substantially shorter and still direct to the required page*, +[tells us Wikipedia](https://en.wikipedia.org/wiki/URL_shortening). URL Shorteners are ubiquitous but nevertheless have +some drawbacks. + +- They are run by big companies that can collect usage data and thus impair privacy. +- Public URL shortening services usually bind their users with an EULA. +- There is no guarantee that short URLs from a public service will be kept active. + In fact, they can disappear at anytime if the backing company goes bankrupt. + +And last but not least, all URL shorteners, be it a public service or a +self-hosted one will complicate the tasks of the web archeologist that will +try to re-construct the history of the Web from the available evidences in ten, +fifty or hundred years. + +All those URL shortener keep their mapping table (short URL -> long URL) private +and this means the user (the one accessing the short URL, not the one creating it) +is completely dependent from the URL shortening service. + +## Project goals + +This project strives to provide a URL shortening service, that: + +- is **public and transparent**: the mapping table (short URL -> long URL) is + public and versioned in a GIT repository. Everyone can fork it, keep it in + a safe place or update it with his own URLs. +- is **self-hostable** and respects the user **privacy**: instead of a couple + of big instances of the **URL shortening service**, we would like to + encourage smaller instances that would drive significantly less traffic + and less temptation to drive income from usage data. + +## Technical Design + +### A GIT Repository to store the mapping table + +The mapping table is stored in a GIT repository as a file or collection of files. +The file(s) contains the mapping table in a format that is easy to write for a +human, easily parseable by the machine and "merge-friendly". + +The user wanting to add a short URL to the mapping table could: + +- Fork the repository containing the mapping table +- Add his mappings to the table +- Create a branch with his modifications +- Commit his changes +- Push them to his own fork +- Submit a Pull Request to ask for inclusion in the main mapping table + +The Pull Request could be subject to approval, review, etc. + +## Auto-generated vs custom code + +Usual URL Shorteners can generate a random short code or take a custom code +from the user. + +In order to be as stateless as possible, the generated short code cannot be +random but needs to be generated deterministically. It can be a hash from the +input URL for instance. + +## Hashing algorithm + +URL Shorteners such as bit.ly use a combination of lower case, upper case +letters plus digits to generate a random code. The code is seven characters +long. + +This translates to about 41 bits of entropy: + +```raw +$ echo 'l(62^7)/l(2)' |bc -l +41.67937417270812646207 +``` + +To implement a similar mechanism but fully deterministic, the SHA256 algorithm +is used to hash the target URL and the first six bytes are encoded in base64. + +The short code for `https://framasoft.org/` is computed as such: + +```raw +$ echo -n https://framasoft.org/ | openssl dgst -sha256 -binary |head -c6 |openssl base64 -e +t0P0JMya +``` + +## File Format of the mapping table + +The mapping table uses YAML as file format it fits all the requirements. + +A sample mapping table looks like: + +```yaml +--- +base_url: https://short.code/ +mapping: +- url: https://framasoft.org/ +- url: https://www.gnu.org/ + short-code: gnu-home +``` + +This sample table defines two entries: + +- `https://short.code/t0P0JMya` that maps to `https://framasoft.org/` +- `https://short.code/gnu-home` that maps to `https://www.gnu.org/` + +## Deployment + +The app is packaged and deployed as a container. In this case, we need +to take into account that the filesystem might be read-only. Updates of the +mapping table comes with a new deployment of an updated image of the container. + +In order to achieve rolling updates without service interruption, a health probe +needs to be implemented. + +When the app is deployed outside of a container, an update of the mapping table +can be triggered by a `git pull` (from a crontab for instance). The app needs +to monitor the file containing the mapping table and hot reload the file once +modifications are detected. + +## Coding principles + +This app follows the [12 factors](https://12factor.net/). + +## Minimal Viable Product + +The MVP of this project has the following features: + +- reads only one mapping table +- serves the requests for only one domain +- supports auto-generated and custom codes +- packaged as a container diff --git a/README.md b/README.md new file mode 100644 index 0000000..7886159 --- /dev/null +++ b/README.md @@ -0,0 +1,12 @@ +# YAUS: Yet Another URL Shortener + +YAUS is a URL shortener, akin to bit.ly and others but with radical design +principles. + +For now, there is no code but you can have a look at our +[design principles](DESIGN.md). + +## Authors + +- Mathieu Demange ([@sigmate](/sigmate)) +- Nicolas Massé ([@nmasse-itix](/nmasse-itix))