[PATCH v5 0/4] Improve abbreviation disambiguation
To
git@vger.kernel.org
Cc
peff@peff.net
gitster@pobox.com
stolee@gmail.com
Derrick Stolee
From
Derrick Stolee
See Also
Prev
Date
2017-10-12 12:02:16 UTC
Changes since previous version:

* Make 'pos' unsigned in get_hex_char_from_oid()

* Check response from open_pack_index()

* Small typos in commit messages 

Thanks,
 Stolee

---

When displaying object ids, we frequently want to see an abbreviation
for easier typing. That abbreviation must be unambiguous among all
object ids.

The current implementation of find_unique_abbrev() performs a loop
checking if each abbreviation length is unambiguous until finding one
that works. This causes multiple round-trips to the disk when starting
with the default abbreviation length (usually 7) but needing up to 12
characters for an unambiguous short-sha. For very large repos, this
effect is pronounced and causes issues with several commands, from
obvious consumers `status` and `log` to less obvious commands such as
`fetch` and `push`.

This patch improves performance by iterating over objects matching the
short abbreviation only once, inspecting each object id, and reporting
the minimum length of an unambiguous abbreviation.

Add a new perf test for testing the performance of log while computing
OID abbreviations. Using --oneline --raw and --parents options maximizes
the number of OIDs to abbreviate while still spending some time
computing diffs. Below we report performance statistics for perf test
4211.6 from p4211-line-log.sh using three copies of the Linux repo:

| Packs | Loose  | Base Time | New Time | Rel%  |
|-------|--------|-----------|----------|-------|
|  1    |      0 |   41.27 s |  38.93 s | -4.8% |
| 24    |      0 |   98.04 s |  91.35 s | -5.7% |
| 23    | 323952 |  117.78 s | 112.18 s | -4.8% |

Derrick Stolee (4):
  p4211-line-log.sh: add log --online --raw --parents perf test
  sha1_name: unroll len loop in find_unique_abbrev_r
  sha1_name: parse less while finding common prefix
  sha1_name: minimize OID comparisons during disambiguation
 sha1_name.c              | 135 +++++++++++++++++++++++++++++++++++++++++------
 t/perf/p4211-line-log.sh |   4 ++
 2 files changed, 123 insertions(+), 16 deletions(-)

-- 
2.14.1.538.g56ec8fc98.dirty