Principal Investigator: Tin Hang (Henry) Hung [email protected]

Project Progress

ARC Quick Note


Your home space: /home/oxfd2652

Your project space: /data/biol-cnvatlas/oxfd2652

Why are we doing this?


Plants can have extreme copy numbers of >300, whereas in humans 10 copies are already considered extreme. If copy number follows a (quasi-)Poisson distribution, where the variance scales with the mean, then its variation should be substantial in plants.
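As a rough illustration of the argument above, here is a minimal sketch of how the spread grows with the mean under a (quasi-)Poisson model. The function name `copy_number_spread`, the `dispersion` parameter, and the use of 10 and 300 as mean copy numbers are assumptions made purely for illustration; the note quotes those values as extremes, not as fitted means.

```python
import math


def copy_number_spread(mean_copies, dispersion=1.0):
    """Return (variance, standard deviation) under a (quasi-)Poisson model.

    Poisson:        variance = mean
    Quasi-Poisson:  variance = dispersion * mean
    """
    variance = dispersion * mean_copies
    return variance, math.sqrt(variance)


# Illustrative values only: treat the quoted copy numbers as if they were means.
for label, mean_copies in [("human-like", 10), ("plant-like", 300)]:
    var, sd = copy_number_spread(mean_copies)
    print(f"{label}: mean={mean_copies}, variance={var:.0f}, sd={sd:.1f}")
```

With a dispersion of 1 (plain Poisson), the variance equals the mean, so the absolute spread is much larger at a mean of 300 than at a mean of 10, even though the spread relative to the mean shrinks. A dispersion above 1 would widen both.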

Stage 1: Curating the database and metadata


<aside> 🏁 Goal 1: Obtain >1,000 genomes, with their associated sequences and metadata, from NCBI

</aside>

The essence of a meta-analysis is unbiased, standardised sampling. We should aim to use NCBI Datasets (which is quite nasty right now because of all the recent changes in the system… see this, this) to obtain genomes that match these criteria: