Author Topic: Bitbucket set to remove Mercurial support  (Read 13422 times)

kitor

  • Contributor
  • Member
  • *****
  • Posts: 193
Re: Bitbucket set to remove Mercurial support
« Reply #100 on: June 15, 2020, 09:53:37 PM »
You need to add your SSH public key to Bitbucket, and run hg once by hand (without disabling the interactive shell) to accept the remote host's SSH key.
EOS R

Danne

  • Contributor
  • Hero Member
  • *****
  • Posts: 6959
« Reply #101 on: June 15, 2020, 10:02:15 PM »
Ok, but first I need some sleep :P. Tomorrow.

Audionut

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 3617
  • Blunt and to the point
« Reply #102 on: June 16, 2020, 02:40:56 AM »
Currently at:

Code: [Select]
hg-repos-07: 23.08% downloaded, 1.96% errors, 74.96% todo
hg-repos-13: 25.35% downloaded, 1.18% errors, 73.47% todo
hg-repos-17: 24.25% downloaded, 3.39% errors, 72.36% todo
hg-repos-18: 22.46% downloaded, 0.51% errors, 77.03% todo
hg-repos-19: 15.49% downloaded, 0.27% errors, 84.24% todo
hg-repos-20: 16.87% downloaded, 0.21% errors, 82.92% todo
hg-repos-21: 20.56% downloaded, 0.26% errors, 79.18% todo

Guess I should update the scripts.

critix

  • Contributor
  • Member
  • *****
  • Posts: 156
« Reply #103 on: June 16, 2020, 06:56:40 AM »
Current:
Quote
hg-repos-05: 23.59% downloaded, 1.78% errors, 74.63% todo
hg-repos-06: 27.30% downloaded, 2.02% errors, 70.68% todo
Canon 1300D, 500D, EOS M, EOS M2

kitor

« Reply #104 on: June 16, 2020, 06:57:56 AM »
How do you check the percentage?
I'm running 17, 18, 19 since midnight.

[e] Ok, I missed the python script ;)

Code: [Select]
hg-repos-17: 14.37% downloaded, 0.43% errors, 85.20% todo
hg-repos-18: 14.57% downloaded, 0.21% errors, 85.22% todo
hg-repos-19: 10.17% downloaded, 0.14% errors, 89.69% todo
EOS R

Audionut

« Reply #105 on: June 16, 2020, 09:34:30 AM »
Couldn't get ssh working here.

ssh is very strict about permissions, and needs either 400 or 600 on the key file.

chmod no worky on symbolic linked files.

Tried all manner of windows permission settings and can only get to 444.
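
For reference, the rule behind this can be checked from Python. A minimal sketch, assuming a POSIX-style filesystem and a hypothetical helper name `key_mode_ok`: OpenSSH refuses a private key whose mode grants any access to group or others, which is why 400 and 600 pass and 444 does not.

```python
import os
import stat

def key_mode_ok(path):
    """Return True if the file mode grants nothing to group/others (e.g. 400 or 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & 0o077) == 0

# Usage sketch (the key path is an assumption):
# print(key_mode_ok(os.path.expanduser("~/.ssh/id_rsa")))
```

On a filesystem that cannot store Linux permission bits, `chmod` may appear to succeed while the reported mode stays unchanged, so checking the mode after changing it is the quickest test.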

Tried different folders within the ubuntu home folder, but ssh-add won't stop bitching to me about file/folder permission outside of the username folder. Yet, ssh-keygen works fine ffs. Can create keys, but then can't add them.

edit: oh, and I deleted the symbolic link username folder from the home folder, added a new folder, and ubuntu still wants to link to the symbolic linked folder. Maybe a restart will fix it. Will finish downloading what is started first, then play again.

kitor

« Reply #106 on: June 16, 2020, 11:05:05 AM »
You use WSL, right?

Code: [Select]
kitor@kitor-p70:kitor$ cat /etc/debian_version
bullseye/sid
kitor@kitor-p70:kitor$ systeminfo.exe | grep -e "^OS Version"
OS Version:                10.0.18363 N/A Build 18363
kitor@kitor-p70:kitor$ ls ~/.ssh | grep id_rsa
-rw------- 1 kitor kitor 2.6K Feb 15 17:02 id_rsa
-rw-r--r-- 1 kitor kitor  569 Feb 15 17:02 id_rsa.pub

If you did some weird stuff like symlinking your Windows home dir to the WSL one, then yep - it will behave as you described (it won't work), as you are trying to write Linux permissions to an NTFS partition through a virtual filesystem.

Also, replying to your older post: https://www.magiclantern.fm/forum/index.php?topic=24420.msg228084#msg228084

Quote
The default home folder in ubuntu is located at:
Code: [Select]
C:\Users\xxxx\AppData\Local\Packages\CanonicalGroupLimited.UbuntuonWindows_79rhkp1fndgsc\LocalState\rootfs\home\username
Replace (xxxx) with your Windows 10 user. Replace (username) with whatever username you used when installing Ubuntu.

I would suggest to cut the username folder and paste it wherever you want to store these downloaded repos. Then drop a symbolic link back into the original ubuntu home location.
I use a link shell extension to make that task easier.

Never, ever access the WSL filesystem this way! It will mess up the Linux FS. Microsoft points this out in its documentation and on the MS dev blog.
Use File Explorer to access \\wsl$ - this is the only official and safe way [1] [2]

This path that you provided is stored on the Windows side using some kind of trickery to keep the Linux system working properly (and thus keep permissions working). If you alter it from Windows, you will experience all sorts of problems, similar to the ones you described.

If you want to access a Windows path from Linux, use /mnt/<letter>/*. If you want to symlink it, do it using ln -s on WSL (Linux symlink on the Linux side). If you need to access the Linux FS from Windows, use the \\wsl$ share mentioned above.
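
The Linux-side symlink can also be created from a script. A minimal sketch, assuming a hypothetical helper `link_into_home` and example paths (`/mnt/d/bitbucket` stands for any Windows drive mounted under `/mnt/<letter>`); `os.symlink` does the same job as `ln -s` when run inside WSL.

```python
import os

def link_into_home(target, link_path):
    """Create link_path -> target, like `ln -s target link_path`."""
    link_path = os.path.expanduser(link_path)
    if os.path.islink(link_path):
        # already a symlink: leave it alone and report where it points
        return os.readlink(link_path)
    os.symlink(target, link_path)
    return target

# Example (both paths are assumptions):
# link_into_home("/mnt/d/bitbucket", "~/bitbucket")
```

The point is only that the link is made by Linux tools on the Linux side; the data itself then lives on the Windows drive.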



[1] You can mount it as a network share, by the way, so your Linux FS will be visible as a disk with an assigned letter in Explorer.
[2] https://devblogs.microsoft.com/commandline/whats-new-for-wsl-in-windows-10-version-1903/
EOS R

Audionut

« Reply #107 on: June 16, 2020, 11:31:57 AM »
I need the output of the script to go to a different location than the WSL Ubuntu install location.

The WSL Ubuntu is installed on a 120gb SSD, which is clearly not suited to this task.

I need to be able to run multiple scripts writing to multiple different HDD locations, to keep the WAN connection saturated without thrashing the HDD.

a1ex

  • Administrator
  • Hero Member
  • *****
  • Posts: 12464
« Reply #108 on: June 16, 2020, 05:57:02 PM »
Repo list completely downloaded, so we now have hg-repos-22, 23 and 24. Last one has only about 2500 repos; the interesting part is that many Mercurial repos were created in 2020, and 7 of them were created... today! They are very likely forks of existing repos. Actually, 3 of the repos created in 2020 were forks of hudson/magic-lantern, created for submitting pull requests, from Bitbucket's web interface.

Links in the big post.

I'll download another repo list; hopefully it will complete in a single attempt this time, so... at the end of the week, I should be able to cross-check the list. If I find anything missing (mistakes can happen), I'll add it to hg-repos-25. The others are now set in stone :)



First 3 sets appear to be complete on my side:
Code: [Select]
hg-repos-00: 91.23% downloaded, 8.77% errors, 0.00% todo, 2.82 MiB average
hg-repos-01: 92.38% downloaded, 7.62% errors, 0.00% todo, 4.16 MiB average
hg-repos-02: 92.08% downloaded, 7.92% errors, 0.00% todo, 6.81 MiB average

For the others, my current estimate of the average size is biased, as I skipped very large repos in my first attempt. Will report as soon as I trust the numbers :)

Extended status script (reporting average repo size, but slow):
Code: [Select]
from __future__ import print_function
import os, subprocess

# repos that failed to clone, one "user/repo" per line; a set keeps lookups fast
try: error_repos = open("hg-clone-errors.txt").readlines()
except IOError: error_repos = []
error_repos = set(x.strip() for x in error_repos)

def repo_size(r):
    # size of the .hg directory in bytes; this is slow
    # argument list + "--": safe for names with spaces or a leading dash
    out = subprocess.check_output(["du", "-b", "-d", "0", "--", r + "/.hg"])
    return int(out.split(b"\t")[0])

for i in range(30):
  fn = "hg-repos-%02d" % i
  try: repos = open(fn).readlines()
  except IOError: continue
  repos = [r.strip() for r in repos]
  repos = list(set(repos)) # unique list (hg-repos-20 contains a few duplicates)
  downloaded = 0
  errors = 0
  total = len(repos)
  size = 0
  for line in repos:
      r = line.split(" ")[1]  # second field is "user/repo"
      if os.path.isfile(r + ".commits"):
          size += repo_size(r)
          downloaded += 1
      elif r in error_repos:
          errors += 1
  # note: the average is taken over all repos in the list, not only the downloaded ones
  print("%s: %.2f%% downloaded, %.2f%% errors, %.2f%% todo, %.2f MiB average" % (fn, downloaded * 100.0 / total, errors * 100.0 / total, (total - errors - downloaded) * 100.0 / total, size * 1.0 / total / 1024 / 1024))
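
Where GNU `du` is not available, the size could be computed in Python instead. This is only a sketch with a hypothetical name (`repo_size_py`); whether it is any faster than `du` is untested, and the total may not match `du -b` exactly.

```python
import os

def repo_size_py(repo_dir):
    """Sum the sizes in bytes of all regular files under repo_dir/.hg."""
    total = 0
    for root, _dirs, files in os.walk(os.path.join(repo_dir, ".hg")):
        for name in files:
            path = os.path.join(root, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total
```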



Quote
I don't know if hg clone already does this, but would be nice to have SHA256 for each split...

Been looking into this as well, but the exact contents of the ".hg" directory seem to be different among fresh clones. One obvious difference is the source URL, which is different if you clone with HTTPS or with SSH. Didn't investigate much, but... even the contents of .hg/store appear to be slightly different. Not sure why exactly.

Checksums for the repo lists, to verify the downloads:
Code: [Select]
# sha256sum all-repos hg-repos*
e68bf18a3433ba921443616e7e69486c25d48f132071a8cb319a5d1212b1a330  all-repos
e1bbf56016d6f958a1539ab5de14c0c8536aa14376ac9431d92e6a4e982032ed  hg-repos
66f9e6fa4fbff1af0c8f6190ae60e285534e334ec02f2b3bbebe28e9651e5580  hg-repos-00
0db369328d1b479d393d8fdd58c9b316c8ef856a71e1cfd749599dc1c9592144  hg-repos-01
761c2c2d39529294fa1aa9e26c6dbb4d2732b6fdb5704b4936f5e7d52dd46a87  hg-repos-02
ae7ef228fb60e750334176318bc672e2783e715108573bb17c3aafa12c786e2f  hg-repos-03
66bbc945f3d885953efcda6a4af2b22a2530e0bdf29bd462212ed45c159123a4  hg-repos-04
cc7deeda69e5f4a1360f129616f1bfe8cb96b63209850a62b4fad391ad2e354e  hg-repos-05
07b3bb8809dedb4c5c97034353393b5f28db7d08e33d1abb0f2a1363121c676f  hg-repos-06
c20596b54a00d9951c2194172eca4f0f32c78a8c44d70919d5b28ab34bcac798  hg-repos-07
9fde67105c497a14ce997c0ca87cba2eba86a9b894372ae89393f4436c6da792  hg-repos-08
fff343b16fa879eef3a3c5f9719c3e5a5de8098af090cfc231124070e1f20190  hg-repos-09
bfff93447f3ec036fcb15e1bebfed97967fb3a84ca8214e44f77b3eb27a696e3  hg-repos-10
7477e4caa1163335c33f0aa7ab711fc9614298aca39158dd6fbf59fd9ae89a8d  hg-repos-11
16d60a7fe0e644143b92df13abcae7af813cdf0ba8d42359da4ef98ca2ab9400  hg-repos-12
d623425b7d6f250b06c992d705f6561b7f8489337968dafe88d297e44d70ff22  hg-repos-13
b6b97a954028972fa06469dc7ce7b44be0bf90c086b5316295a4d80b39057b8c  hg-repos-14
84ab787f19a4d6463c3f343da73d5824dc4acef7cbdd7670a73f6e4be0bb90e3  hg-repos-15
138c982985f812db853903c2dba4f6dfe8c16e441e9d4b801d08daf45a15870b  hg-repos-16
520cdbec2fabb54e09cb66b5572b654352fbc1ff46625ad0a23b446dd61a547b  hg-repos-17
ce4cf07804864d6832976c36521b502ec7e162057449d076c87de5191ad9e180  hg-repos-18
3a8353098236fbf7cfceacd2121af0e44d5b9b46734e81ef38f7e4f5bdeb2848  hg-repos-19
1b6255b7aec596d2b2931589192b72daacd52bc9c6198f25b9cd4cdf6e04b4d1  hg-repos-20
c230a100849e1d85a243dd512aba0debf996ca8e27af7ad351f6754447621e10  hg-repos-21
de76f1989577991f40f6addd2d67e62d5f66048fcb47d86b9dab6444f835bcbe  hg-repos-22
661da278c57fef58306e6144f2dde7f2e0c09213fa3a47e9c6245864c68f08c8  hg-repos-23
6dbffbb608179da20d4f4c0dac8252ca9eba23657dcc6eac13288a4d3b6ff6e5  hg-repos-24

# md5sum all-repos hg-repos*
73f16c6e7d8ec44a22d35748505f4486  all-repos
a56d079deadb5689e199cc8bb9112c9a  hg-repos
9b7074c0b8fa74078b977778b0aa2f51  hg-repos-00
648ae2d23a6b444423c1e7a429fffbc5  hg-repos-01
f1e72fb46f4ea57558f4e47330d663f5  hg-repos-02
ff7ea3f6b8ea07247e51b365487c4ec5  hg-repos-03
aebca1355249198916050d001dabac86  hg-repos-04
7c221e98539145e187d13543e0ccb447  hg-repos-05
ab5533cca9c6501d352bd2d981cfbe01  hg-repos-06
a76b118a375c095d5b2be91c7c846e96  hg-repos-07
19d36f29bfc12a0aa72b6f7ded194ba6  hg-repos-08
e6dffbab4d1b717eb1177ccb1133a55a  hg-repos-09
884c5c49a4880b688a5d8f624dfe2188  hg-repos-10
10cf61d018db3551c4087b4e671b80b6  hg-repos-11
7cfeba30d3f9462aa5c327fd8ba12a5c  hg-repos-12
c481fbad58b375b1c316e1c5493b0924  hg-repos-13
079e724a6ce23a69c0a6c0d1bc8c27a3  hg-repos-14
8763470dcbec370d062b40aea2a2dafb  hg-repos-15
b365070a0066ea15aee087c77f848d6c  hg-repos-16
1f45337c6efc5041e21b7d6f85bb1b5c  hg-repos-17
aab6d956adf417ca369691d0858c6617  hg-repos-18
229ccf672fd9d77178ffce692a65a7f8  hg-repos-19
7052b2669a69f7d8b9229d58e5cc1975  hg-repos-20
a320bb90fc163855909d4893e395c48f  hg-repos-21
265a5ac75e355fa27e45aefd23ce0dc1  hg-repos-22
89c1b7070d0a91aab8ff6a53c440199f  hg-repos-23
8ebeb53282679c8f3f5c060f2c584451  hg-repos-24
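
A downloaded list can be checked against these values with `sha256sum -c`, or from Python. A minimal sketch; the helper names `sha256_of` and `verify` are made up for illustration.

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(expected):
    """expected: dict of file name -> hex digest. Returns the names that do not match."""
    return [name for name, digest in expected.items() if sha256_of(name) != digest]

# Example with one entry from the list above:
# verify({"hg-repos-24": "6dbffbb608179da20d4f4c0dac8252ca9eba23657dcc6eac13288a4d3b6ff6e5"})
```

An empty result means every listed file matched.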

Currently, for each repo, the download script creates a list of commit hashes. These depend on the "contents" (commit body, message, timestamp etc) and on the parent commit(s) (details), so as long as the .hg directory is not messed up, I'd say it should be OK for integrity checks. When collecting the downloaded files from all participants, I can re-create the lists of commits and compare them, and I can also run "hg verify" on each repo.
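
Comparing two such lists could look like the sketch below. The format of the `.commits` files is not shown in this thread, so one hash per line is an assumption, and both function names are hypothetical.

```python
def read_commits(path):
    """Read a commit list, one hash per line (assumed format), ignoring blank lines."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

def compare_commit_lists(path_a, path_b):
    """Return (only_in_a, only_in_b) as sorted lists; both empty means the sets match."""
    a, b = set(read_commits(path_a)), set(read_commits(path_b))
    return sorted(a - b), sorted(b - a)
```

Comparing as sets ignores ordering, so two clones that list the same commits in a different order still count as equal.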



Also noticed the .hg/store/data directory has many file names resembling the ones in the source code, and got a crazy idea: what if we group similar or identical files together? Would the compressor find repeated patterns easier? This idea was previously explored, and apparently has some merit.

On the previous example of hudson/magic-lantern and all of its forks, this command compressed the entire thing to 157 MiB (without sorting: 273.3 MiB, uncompressed size 16.4 GiB, individually compressed repos 14.3 GiB):
Code: [Select]
find -type f -path '*/*/.hg/*' | rev | sort | rev | tar -cvf - -T - | xz -9e --lzma2=dict=1536Mi -c - > ml-repos.tar.xz

It sorts the file list by the reversed-string file path, effectively grouping files with the same name together (without looking at the contents). TODO: try the approaches from the "morimori" blog post and compare various commands.
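
The effect of `rev | sort | rev` is easy to see on a toy list. A small sketch; the paths are invented, and Python compares by code point while `sort` follows the locale, so the exact order may differ from the shell pipeline even though the grouping is the same.

```python
def sort_by_reversed_path(paths):
    """Order paths by their reversed string, like `rev | sort | rev`."""
    return sorted(paths, key=lambda p: p[::-1])

paths = [
    "alice/ml/.hg/store/data/src/menu.c.i",
    "alice/ml/.hg/store/data/src/raw.c.i",
    "bob/ml/.hg/store/data/src/menu.c.i",
    "bob/ml/.hg/store/data/src/raw.c.i",
]
for p in sort_by_reversed_path(paths):
    print(p)
```

Both `menu.c.i` files end up next to each other, followed by both `raw.c.i` files, so the compressor sees likely-similar data back to back.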

kitor

« Reply #109 on: June 16, 2020, 06:30:39 PM »
Quote
Also noticed the .hg/store/data directory has many file names resembling the ones in the source code, and got a crazy idea: what if we group similar or identical files together? Would the compressor find repeated patterns easier?

IIRC not possible (at least easily) on classic "on the fly" compressors like gzip. But afair squashfs does block-level deduplication. Thus grouping forked repos and squashing them in one file may be the way to go. mksquashfs / unsquashfs should be available in every distro...
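
Before trying this on a group of forks, it may be worth measuring how much of the data is byte-identical across them. A rough sketch with a hypothetical helper, `duplicate_bytes`; it only finds whole-file duplicates, so it says nothing about block-level or near-duplicate matches.

```python
import hashlib
import os
from collections import defaultdict

def duplicate_bytes(top):
    """Return (total_bytes, duplicate_bytes) for regular files under top.

    Files are grouped by the SHA-256 of their content; every copy after
    the first in a group counts as duplicate.
    """
    sizes = defaultdict(list)
    for root, _dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            if os.path.islink(path):
                continue
            h = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            sizes[h.hexdigest()].append(os.path.getsize(path))
    total = sum(sum(v) for v in sizes.values())
    dup = sum(sum(v[1:]) for v in sizes.values())
    return total, dup
```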

[e @ 11PM] To not spam. I added 22-24:

Code: [Select]
hg-repos-17: 35.56% downloaded, 0.99% errors, 63.45% todo
hg-repos-18: 34.18% downloaded, 0.47% errors, 65.35% todo
hg-repos-19: 27.13% downloaded, 0.27% errors, 72.60% todo
hg-repos-22: 0.14% downloaded, 0.00% errors, 99.86% todo
hg-repos-23: 0.02% downloaded, 0.00% errors, 99.98% todo
hg-repos-24: 0.24% downloaded, 0.00% errors, 99.76% todo
So far it took about 500 gigs.

[e @ 11:30PM] And just added 20 and 21 on secondary location:
Code: [Select]
hg-repos-20: 0.06% downloaded, 0.00% errors, 99.94% todo
hg-repos-21: 0.39% downloaded, 0.00% errors, 99.61% todo
EOS R

critix

« Reply #110 on: June 16, 2020, 06:39:26 PM »
Current
- with new script :
hg-repos-05: 62.37% downloaded, 4.42% errors, 33.21% todo, 7.55 MiB average
hg-repos-06: 64.55% downloaded, 5.49% errors, 29.96% todo, 7.93 MiB average


- with old script:
hg-repos-05: 62.16% downloaded, 4.42% errors, 33.42% todo
hg-repos-06: 63.70% downloaded, 5.49% errors, 30.81% todo
Canon 1300D, 500D, EOS M, EOS M2

kitor

« Reply #111 on: June 17, 2020, 07:21:56 AM »
Script has a problem (this failed on repos 19)

Code: [Select]
Processing -1/objectlistview ...
hg clone: option -1 not recognized
hg clone [OPTION]... SOURCE [DEST]

make a copy of an existing repository

options ([+] can be repeated):

This is not the only example of a username starting with "-". Simply wrapping the variables in quotes doesn't work; those would need to be properly escaped.

Status:
Primary:
Code: [Select]
hg-repos-17: 43.58% downloaded, 1.34% errors, 55.08% todo
hg-repos-18: 38.51% downloaded, 0.57% errors, 60.92% todo
hg-repos-19: 29.07% downloaded, 0.34% errors, 70.59% todo
hg-repos-22: 4.22% downloaded, 0.10% errors, 95.68% todo
hg-repos-23: 2.83% downloaded, 0.23% errors, 96.94% todo
hg-repos-24: 9.56% downloaded, 0.31% errors, 90.12% todo
I'm a little worried about 22 and 23 lagging behind :) 19 slowed due to an error above (hg decided to show interactive help via less...)
I'm thinking about moving 2nd half of 17-19 to secondary location to leave bandwidth free for 22-24. I had to limit it to 60mbit/s to keep everything working and it's saturated all the time ;)

secondary:
Code: [Select]
hg-repos-20: 8.14% downloaded, 0.12% errors, 91.74% todo
hg-repos-21: 11.43% downloaded, 0.08% errors, 88.49% todo
EOS R

a1ex

« Reply #112 on: June 17, 2020, 08:27:47 AM »
The failing command was:
Code: [Select]
hg clone --config ui.interactive=false -U ssh://hg@bitbucket.org/-1/objectlistview -1/objectlistview

So, plain "hg clone" of that repo worked, but specifying a destination that started with a dash... caused trouble, as hg tried to interpret it as an option. Many tools accept "--" as "end of options", and apparently it's working here as well:

Code: [Select]
hg clone --config ui.interactive=false -U -- ssh://hg@bitbucket.org/-1/objectlistview -1/objectlistview
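
The download script itself is not shown in this part of the thread, so the following is only a sketch of how a Python wrapper might build the same command (`hg_clone_argv` is a made-up name). Passing a list avoids shell quoting problems, but "--" is still needed because hg parses the arguments itself.

```python
def hg_clone_argv(repo):
    """Build the argument list for cloning bitbucket.org/<repo> into ./<repo>."""
    return [
        "hg", "clone",
        "--config", "ui.interactive=false",
        "-U",
        "--",                              # end of options
        "ssh://hg@bitbucket.org/" + repo,  # SOURCE
        repo,                              # DEST, may start with "-"
    ]

# To actually run it (needs hg and ssh access):
# import subprocess; subprocess.check_call(hg_clone_argv("-1/objectlistview"))
```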

Good catch.

Updated the script. Found such repos in 13, 15, 16, 17, 19, 20, 21, 22, 23.

kitor

« Reply #113 on: June 17, 2020, 09:15:27 AM »
Quote
started with a dash... caused trouble

Still it will be fun to work with those later  ;) (my favorite bash test on newcomers is to create " -rf" directory in / and ask them to remove it from root account)
EOS R

Danne

« Reply #114 on: June 17, 2020, 09:41:19 AM »
Still downloading package 7 - 10. Ran the original script again on package 8 cause it missed a few repos it seems but should be done soon, probably today. Will run the latest script version on all packages once finished.

kitor

« Reply #115 on: June 17, 2020, 09:48:55 AM »
I also have some of those:

Code: [Select]
Processing JannisKhammas/railbound-frontiers ...
applying clone bundle from https://api.media.atlassian.com/file/blahblahblah
adding changesets
adding manifests
adding file changes
transaction abort!
rollback completed
abort: stream ended unexpectedly  (got 196 bytes, expected 4096)
I wonder if restarting script will get them?
EOS R

a1ex

« Reply #116 on: June 17, 2020, 10:21:11 AM »
Very likely. Or, once it finishes, run the script again 2-3 times (it will retry the failed ones).
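
The retry idea can be sketched as below. This is not the actual download script; `run_with_retries` and the `clone` callable are hypothetical stand-ins for whatever performs one clone and reports success.

```python
def run_with_retries(repos, clone, passes=3):
    """Call clone(repo) for each repo; re-run only the failures, up to `passes` times.

    `clone` is any callable returning True on success.
    Returns the repos that are still failing after the last pass.
    """
    pending = list(repos)
    for _ in range(passes):
        if not pending:
            break
        pending = [r for r in pending if not clone(r)]
    return pending
```

Transient failures such as "stream ended unexpectedly" should drop out after a pass or two; whatever is left is more likely a permanent error.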

Danne

« Reply #117 on: June 17, 2020, 10:36:24 AM »
repo 8 downloaded and reran latest script and all was done and good. So, this is the compression algo to use?
Code: [Select]
find -type f -path '*/*/.hg/*' | rev | sort | rev | tar -cvf - -T - | xz -9e --lzma2=dict=1536Mi -c - > repo_8.tar.xz
Guess once I start it will begin to chew for a few hours?

Checksum list, hm, howto perform...will check.

Audionut

« Reply #118 on: June 18, 2020, 02:54:41 AM »
What is the best script command to delete duplicate material?

For instance I have some hg-repos-20 on a HDD from when I was initially downloading everything to a single drive. I have since begun downloading hg-repos-20 to another drive, and I would like to delete any fragments of hg-repos-20 from the initial drive.

Can someone confirm any compression being used before I begin uploading to a hosting server?

critix

« Reply #119 on: June 18, 2020, 07:50:05 AM »
Current:
With old script:
hg-repos-05: 93.17% downloaded, 6.83% errors, 0.00% todo
hg-repos-06: 92.08% downloaded, 7.92% errors, 0.00% todo

With new script:
hg-repos-05: 93.17% downloaded, 6.83% errors, 0.00% todo, 11.03 MiB average
hg-repos-06: 92.41% downloaded, 7.59% errors, 0.00% todo, 11.24 MiB average

I'm running the scripts again
Canon 1300D, 500D, EOS M, EOS M2

a1ex

« Reply #120 on: June 18, 2020, 08:07:18 AM »
Quote
For instance I have some hg-repos-20 on a HDD from when I was initially downloading everything to a single drive. I have since begun downloading hg-repos-20 to another drive, and I would like to delete any fragments of hg-repos-20 from the initial drive.

I've used this to move the stuff belonging to one "hg-repos" set, into another directory.

Code: [Select]
# script must be run from the working directory (where things were downloaded)
# usage: move-out.sh 05      # only include the number of the repo list
LST="hg-repos-$1"            # must be in the working directory
DST="../bitbucket-repos-$1"  # must be outside the working directory, sans trailing slash

# second space-separated field of each line is "user/repo"
cut -d ' ' -f 2 "$LST" | while IFS= read -r f; do
  printf '%s\n' "$f"
  if [ -f "./$f.commits" ]; then
    # quotes and "--" keep names with spaces or a leading dash from breaking things
    mkdir -p -- "$DST/$f"
    mv -- "$f/.hg" "$DST/$f/"
    mv -- "$f.commits" "$DST/$f.commits"
  fi
done
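
For anyone who prefers Python over the shell, roughly the same move could be sketched as follows (Python 3; `move_out` is a made-up name and this version has not been run against the real data). Like the shell script, it must be run from the working directory.

```python
import os
import shutil

def move_out(list_file, dst):
    """Move <repo>/.hg and <repo>.commits for every finished repo in list_file into dst."""
    moved = []
    with open(list_file) as f:
        for line in f:
            fields = line.split(" ")
            if len(fields) < 2:
                continue
            repo = fields[1].strip()         # second field: "user/repo"
            commits = repo + ".commits"
            if not os.path.isfile(commits):  # not downloaded (yet)
                continue
            os.makedirs(os.path.join(dst, repo), exist_ok=True)
            shutil.move(os.path.join(repo, ".hg"), os.path.join(dst, repo, ".hg"))
            shutil.move(commits, os.path.join(dst, commits))
            moved.append(repo)
    return moved

# Example (assumed names): move_out("hg-repos-05", "../bitbucket-repos-05")
```

Since no shell or option parser is involved, repo names starting with a dash need no special handling here.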

Quote
Can someone confirm any compression being used before I begin uploading to a hosting server.

Yes, .hg directories contain mostly compressed data. Unless you group redundant things together (e.g. forks) and create a solid archive, you will not get a noticeable reduction in archive size; see my analysis on the hudson/magic-lantern repo and all of its forks.
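
The difference between separate archives and one solid archive can be shown on synthetic data. A toy sketch: the random bytes merely stand in for already-compressed repo data, and the "fork" shares most of its content with the original, so none of the numbers say anything about real repos.

```python
import lzma
import random

rng = random.Random(0)
# incompressible on its own, like data that is already compressed
base = bytes(rng.getrandbits(8) for _ in range(50000))
# a "fork": first 40000 bytes identical, the rest new
fork = base[:40000] + bytes(rng.getrandbits(8) for _ in range(10000))

separate = len(lzma.compress(base)) + len(lzma.compress(fork))
solid = len(lzma.compress(base + fork))

print("separate archives:", separate, "bytes")
print("one solid archive:", solid, "bytes")
```

Compressed separately, neither blob shrinks; compressed together, the shared part is stored only once, as long as the repeat falls within the compressor's dictionary size.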

Quote
repo 8 downloaded and reran latest script and all was done and good. So, this is the compression algo to use?
Code: [Select]
find -type f -path '*/*/.hg/*' | rev | sort | rev | tar -cvf - -T - | xz -9e --lzma2=dict=1536Mi -c - > repo_8.tar.xz
Guess once I start it will begin to chew for a few hours?

This appears to work fine, but it's very slow. My attempt at this command ran for over 7 hours on the first 4 sets (00-03, about 200 GiB) and ran out of disk space :D

Code: [Select]
hg-repos-00: 91.23% downloaded, 0.00% errors, 8.77% todo, 2.82 MiB average
hg-repos-01: 92.38% downloaded, 0.00% errors, 7.62% todo, 4.16 MiB average
hg-repos-02: 92.08% downloaded, 0.00% errors, 7.92% todo, 6.81 MiB average
hg-repos-03: 93.59% downloaded, 0.00% errors, 6.41% todo, 8.38 MiB average

On the good side, "find" does include files starting with dot (.) or dash (-) without any fuss.

kitor

« Reply #121 on: June 18, 2020, 08:35:52 AM »
Status update [e: 12:00 CEST]:

Primary:
Code: [Select]
hg-repos-17: 72.14% downloaded, 2.49% errors, 25.37% todo
hg-repos-18: 57.80% downloaded, 1.10% errors, 41.10% todo
hg-repos-19: 42.02% downloaded, 0.47% errors, 57.51% todo
hg-repos-22: 18.49% downloaded, 0.26% errors, 81.25% todo
hg-repos-23: 15.17% downloaded, 0.41% errors, 84.42% todo
hg-repos-24: 39.75% downloaded, 0.75% errors, 59.50% todo
Primary is still running without "--" fix, I'll rerun it later.

Secondary:
Code: [Select]
hg-repos-20: 56.82% downloaded, 1.34% errors, 41.84% todo
hg-repos-21: 99.87% downloaded, 0.13% errors, 0.00% todo
hg-repos-22: 0.03% downloaded, 0.00% errors, 99.97% todo
hg-repos-23: 0.02% downloaded, 0.00% errors, 99.98% todo
20 just restarted with "--" fix, 21-23 run on fixed scripts.

@alex - do you have any method to group forks together? I'd like to try squashing them, as I proposed a few posts earlier.
EOS R

Audionut

« Reply #122 on: June 18, 2020, 10:58:06 AM »
Code: [Select]
hg-repos-05: 0.41% downloaded, 0.01% errors, 99.58% todo
hg-repos-07: 92.83% downloaded, 7.17% errors, 0.00% todo
hg-repos-13: 52.67% downloaded, 2.44% errors, 44.89% todo
hg-repos-17: 68.06% downloaded, 2.16% errors, 29.78% todo
hg-repos-18: 64.46% downloaded, 1.10% errors, 34.44% todo
hg-repos-19: 20.84% downloaded, 0.20% errors, 78.96% todo
hg-repos-20: 59.69% downloaded, 0.49% errors, 39.82% todo
hg-repos-21: 99.81% downloaded, 0.19% errors, 0.00% todo
hg-repos-24: 0.08% downloaded, 0.04% errors, 99.88% todo

Been some stopping and starting as I've needed to rearrange data (and trying to get SSH working yesterday). Advantage is the repos are being checked multiple times (for data already downloaded).
Found some space on another HDD to get 2 more repos going (05/24).
As soon as 07 is finished I can begin to upload 07 & 21 and start another 2 repos. edit:Done and uploading

I can add a user to my G Suite account (for a month or 2) if a central host is appealing and needed. Not sure if Google will like multiple IPs uploading on the one user account, but I guess there's only one way to find out. There's a documented 1TB space per user, but in practice it's unlimited.

edit: 37 Mbps (actual) upload = 15GB / hour. This will take some time.
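
The arithmetic behind that figure, as a small sketch (function names are made up; Mbit is taken as 10^6 bits and GiB as 2^30 bytes, and protocol overhead is ignored):

```python
def gib_per_hour(mbit_per_s):
    """Convert a sustained rate in Mbit/s (10^6 bits) to GiB (2^30 bytes) per hour."""
    bytes_per_hour = mbit_per_s * 1e6 / 8 * 3600
    return bytes_per_hour / 2**30

def hours_for(size_gib, mbit_per_s):
    """Hours needed to transfer size_gib at the given rate."""
    return size_gib / gib_per_hour(mbit_per_s)

print("%.1f GiB/hour" % gib_per_hour(37))
```

This prints 15.5 GiB/hour, in line with the estimate above; at that rate one TiB takes roughly 66 hours.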

kitor

« Reply #123 on: June 20, 2020, 11:01:53 AM »
Since there were no updates lately:

Code: [Select]
hg-repos-17: 96.80% downloaded, 3.20% errors, 0.00% todo
hg-repos-18: 91.07% downloaded, 1.72% errors, 7.21% todo
hg-repos-19: 72.30% downloaded, 0.69% errors, 27.01% todo
hg-repos-22: 40.85% downloaded, 0.48% errors, 58.67% todo
hg-repos-23: 44.74% downloaded, 0.68% errors, 54.58% todo
hg-repos-24: 94.37% downloaded, 5.63% errors, 0.00% todo
17, 24 already run twice. ~1.5TB at the moment.

Code: [Select]
hg-repos-20: 99.31% downloaded, 0.69% errors, 0.00% todo
hg-repos-21: 99.87% downloaded, 0.13% errors, 0.00% todo
hg-repos-22: 43.59% downloaded, 0.24% errors, 56.17% todo
hg-repos-23: 40.28% downloaded, 0.34% errors, 59.38% todo
20,21 already run twice. This one is 850GB at the moment.

Just note that they overlap, as I'm downloading two copies of 22 and 23 in two locations. And they seem to be much bigger than the surrounding ones ;)
EOS R

Audionut

« Reply #124 on: June 20, 2020, 01:47:35 PM »
I'm here:

Code: [Select]
hg-repos-05: 37.86% downloaded, 3.05% errors, 59.09% todo
hg-repos-07: 92.83% downloaded, 7.17% errors, 0.00% todo
hg-repos-13: 94.49% downloaded, 5.51% errors, 0.00% todo
hg-repos-17: 97.09% downloaded, 2.91% errors, 0.00% todo
hg-repos-18: 67.72% downloaded, 1.16% errors, 31.12% todo
hg-repos-19: 58.56% downloaded, 0.46% errors, 40.98% todo
hg-repos-20: 99.33% downloaded, 0.67% errors, 0.00% todo
hg-repos-21: 99.81% downloaded, 0.19% errors, 0.00% todo
hg-repos-24: 47.42% downloaded, 0.71% errors, 51.87% todo

My issue atm is this:


Don't be fooled by the number of items remaining, windows is still counting the total number of items.

I tried zipping the folder up, got 18 hrs in (50%) and the linked online drive disconnected.  ::)
Google file sync opens a good number of connections, but given the small size of most of the files the upload speed is still limited.
I need to find some way to select the content of the top folder and individually compress each folder within it. That will give me some redundancy against a crash, and increase the size of the files being uploaded, which will increase the overall upload speed. Otherwise I'm going to be limited in how many more repo sets I can get in the next 10 days.
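
One way to do that is a small Python 3 script that writes one archive per subfolder. A sketch under assumptions: `archive_each_subdir` is a made-up name, the output folder must be outside the top folder, and plain uncompressed tar is used because .hg data is mostly compressed already.

```python
import os
import tarfile

def archive_each_subdir(top, out_dir):
    """Write one uncompressed .tar per immediate subdirectory of top.

    Returns the archives created in this run; existing ones are skipped.
    """
    os.makedirs(out_dir, exist_ok=True)
    archives = []
    for name in sorted(os.listdir(top)):
        src = os.path.join(top, name)
        if not os.path.isdir(src):
            continue
        dest = os.path.join(out_dir, name + ".tar")
        if os.path.exists(dest):
            continue  # finished in an earlier run; skip after a crash or restart
        tmp = dest + ".part"
        with tarfile.open(tmp, "w") as tar:
            tar.add(src, arcname=name)
        os.rename(tmp, dest)  # only complete archives get the final name
        archives.append(dest)
    return archives

# Example (paths are assumptions):
# archive_each_subdir("/mnt/e/bitbucket-repos-07", "/mnt/e/upload-07")
```

Because each archive gets its final name only once it is complete, an interrupted run can simply be started again and will continue with the folders that are still missing.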

I'll be able to finish what I've got going, but otherwise I'm effectively out of HDD space until data is uploaded.