Re: [PATCH v5] gitweb: redacted e-mail addresses feature.
To: Georgios Kontaxis
Cc: Georgios Kontaxis via GitGitGadget, git@vger.kernel.org,
    "Ævar Arnfjörð Bjarmason", brian m. carlson
From: Eric Wong
Date: 2021-04-08 17:16:48 UTC

Georgios Kontaxis <geko1702+commits@99rst.org> wrote:
> > Georgios Kontaxis via GitGitGadget <gitgitgadget@gmail.com> wrote:
> >> Introduce an 'email-privacy' feature which redacts e-mail addresses
> >> from the generated HTML content
> >
> Eric Wong wrote:
> > A general reply to the topic: have you considered munging
> > addresses in a way that is still human readable, but obviously
> > obfuscated?
> >
> > On some other project, I settled on HTML "&#8226;" as a replacement
> > for '.' for admins who enable that option.  The $USER@$NO_DOT
> > remains as-is for easy identification+recognition of hosts.
> >
> Thanks for the suggestion.
> 
> People have been trying to hinder address harvesting for a while now.
> Replacing '@' with "at", the dot with "dot", adding spaces, etc.
> was pretty common at some point. May still be.
> I would expect crawlers to have caught up, and this includes
> all sorts of character encodings and Unicode look-alike substitutions.

I figure the crawlers hit a combinatorial explosion and
give up, since they'd be wasting time on false positives.

> > I also considered Unicode homographs which can look identical
> > to replacement characters, too; but rejected that idea since
> > it would cause grief for legitimate users who would not notice
> > the homograph when pasting into their mail client.

As a data point, none of the homograph@ candidates I posted here
on Mar 29 have attracted any attempts on my mail server.
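
For illustration, the dot-replacement munging described in the quoted
text above could be sketched roughly as follows. This is only a sketch
in Python, not the code from the other project or from gitweb: the
function name, the simplified address pattern, and the choice to leave
dots in the local part untouched are all assumptions made here.

```python
import html
import re

# Hypothetical, simplified address pattern (not RFC 5322 compliant):
# a local part, '@', then a host containing at least one dot.
ADDR_RE = re.compile(r"([\w.+=-]+)@([\w-]+(?:\.[\w-]+)+)")

# HTML numeric character reference for U+2022 BULLET.
BULLET = "&#8226;"


def munge_addresses(text: str) -> str:
    """Escape text for HTML, then replace each '.' in the host part of
    anything that looks like an address with the bullet entity.

    Assumption: only dots after the '@' are replaced; the local part
    and the dot-free host labels stay readable.
    """
    def repl(match: re.Match) -> str:
        user, host = match.group(1), match.group(2)
        return f"{user}@{host.replace('.', BULLET)}"

    # Escape first so the inserted entity is not escaped a second time.
    return ADDR_RE.sub(repl, html.escape(text))


if __name__ == "__main__":
    print(munge_addresses("A U Thor <author@example.com>"))
```

The output stays readable in a browser, while a pattern that expects a
literal '.' in the host no longer matches the HTML source.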