Re: [PATCH v5] gitweb: redacted e-mail addresses feature.
To
Georgios Kontaxis via GitGitGadget
Cc
git@vger.kernel.org
Ævar Arnfjörð Bjarmason
brian m. carlson
Georgios Kontaxis
From
Eric Wong
Date
2021-03-29 01:47:44 UTC
Georgios Kontaxis via GitGitGadget <gitgitgadget@gmail.com> wrote:
> Gitweb extracts content from the Git log and makes it accessible
> over HTTP. As a result, e-mail addresses found in commits are
> exposed to web crawlers and they may not respect robots.txt.
> This can result in unsolicited messages.

> Introduce an 'email-privacy' feature which redacts e-mail addresses
> from the generated HTML content

A general reply to the topic: have you considered munging
addresses in a way that is still human readable, but obviously
obfuscated?

On some other project, I settled on HTML "&#8226;" as a replacement
for '.' for admins who enable that option.  The $USER@$NO_DOT
remains as-is for easy identification+recognition of hosts.
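A minimal sketch of that kind of munging, in Python rather than gitweb's Perl. The regex and the munge_addresses name are hypothetical, and it assumes only the dots in the host part are replaced; it is not the code from that other project.

```python
import html
import re

# Hypothetical, deliberately loose pattern for user@host.tld;
# this is not a full RFC 5322 address parser.
EMAIL_RE = re.compile(
    r"([A-Za-z0-9._%+-]+)@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)"
)

BULLET = "&#8226;"  # HTML entity for U+2022 BULLET


def munge_addresses(text: str) -> str:
    """HTML-escape text, replacing '.' in each address's host with a bullet."""
    out = []
    pos = 0
    for m in EMAIL_RE.finditer(text):
        # Escape the plain text between the previous match and this one.
        out.append(html.escape(text[pos:m.start()]))
        user, host = m.group(1), m.group(2)
        # The user part and the '@' stay as-is; only host dots are swapped.
        out.append(html.escape(user) + "@" + html.escape(host).replace(".", BULLET))
        pos = m.end()
    out.append(html.escape(text[pos:]))
    return "".join(out)


print(munge_addresses("Eric <e.w@80x24.org>"))
# prints: Eric &lt;e.w@80x24&#8226;org&gt;
```

The entity is inserted after escaping so that the '&' in "&#8226;" is not itself escaped.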

I also considered Unicode homographs, which can look identical
to the characters they replace, but rejected that idea since
it would cause grief for legitimate users who would not notice
the homograph when pasting into their mail client.
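To illustrate why a homograph is easy to miss, here is a small Python sketch that lists the non-ASCII characters hiding in a pasted address. The non_ascii_chars helper is hypothetical and only for illustration; a mail client would not normally run such a check, which is the problem.

```python
import unicodedata


def non_ascii_chars(address: str):
    """Return (char, code point, Unicode name) for each non-ASCII character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in address
        if ord(ch) > 0x7F
    ]


# A plain address has nothing to report.
print(non_ascii_chars("homograph@80x24.org"))
# prints: []

# The same address with U+2022 BULLET in place of the dot.
print(non_ascii_chars("homograph@80x24\u2022org"))
# prints: [('•', 'U+2022', 'BULLET')]
```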

Anyways, here's the list of candidates I tried:

homograph∂80x24.org
homograph@80x24ͺorg
homograph@80x24·org
homograph@80x24•org
homograph＠80x24.org
homograph﹫80x24.org

https://en.wikipedia.org/wiki/Ano_Teleia#Similar_symbols
https://en.wikipedia.org/wiki/Enclosed_A

homographⒶ80x24.org
homograph@80x24 org
homograph@80x24․org
homograph@80x24ꓸorg