README 640 B

123456789101112131415
  1. The Python scripts are for generating test data, because Python's Unicode
  2. support is much, much, much, much better than PHP's.
  3. * `utf-8/urlencode.py`, `utf-8/u-urlencode.py` and `utf-8/entitize.py` process UTF-8
  4. into a few different formats (%-encoding, %u-encoding, &#decimal;)
  5. and are used like normal UNIXy pipes.
  6. Try:
  7. `python urlencode.py < utf-8.txt > urlencoded.txt`
  8. `python u-urlencode.py < utf-8.txt > u-urlencoded.txt`
  9. `python entitize.py < utf-8.txt > entitized.txt`
  10. * `windows-1252.py` converts Windows-only smart-quotes and things
  11. into their unicode &#decimal reference; equivalents.