How to organize a PhD when buried under a mountain of data

I will preface this by saying I am not an organized person; if you need proof, just look at the picture of my desk below.

Research projects are inevitable in life: their topics range from planning a trip or event to writing a PhD. At least for me, one of the hardest things about any research project is staying organized. But more on that later.

My desk

What is data and how can it become a mountain? Data is defined by the Oxford English Dictionary as “Facts and statistics collected together for reference or analysis.” Nowadays data has infiltrated every aspect of our lives. One of the primary tasks during my PhD has been to identify how microorganisms use basaltic rock as a substrate. To do this I have collected tomography data at a variety of scales, from data sets that resolve features tens of micrometers across to data sets that can be used to observe features larger than 500 nanometers. Now that it is collected, I have to analyse it all. As Pavel has mentioned in earlier posts, tomography datasets consist of thousands of individual files that together can be used to create a 3D rendering of the object that was scanned.

It is because of this that I have ended up with a mountain of data to climb. The computer on my desk in the image above has 8 TB of storage. Next to my desk is a server with a capacity of ~65 TB, and scattered around my office and apartment are more than 15 portable hard drives, each with a capacity of at least 3 TB. At last look, I had over 40 TB of primary data, all of which must be stored in duplicate. Most of these data will balloon to 3 times their original file sizes during the analysis process.
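To put those numbers side by side, here is a rough back-of-the-envelope sketch in Python. It rounds the quoted figures to their stated minimums (exactly 15 drives of exactly 3 TB, exactly 40 TB of primary data) and assumes that all of the primary data triples in size during analysis, although the post only says "most". The variable and function names are made up for illustration.

```python
# Back-of-the-envelope storage estimate from the figures quoted in the post.
# Assumptions (not from the post): minimum values are used for every
# "more than" / "at least" figure, and ALL primary data is assumed to
# triple in size during analysis (the post says "most").

DESKTOP_TB = 8            # desktop computer
SERVER_TB = 65            # server, quoted as ~65 TB
PORTABLE_DRIVES = 15      # "more than 15" drives
PORTABLE_DRIVE_TB = 3     # "at least 3 TB" each

PRIMARY_TB = 40           # "over 40 TB" of primary data
COPIES = 2                # primary data is stored in duplicate
ANALYSIS_GROWTH = 3       # size during analysis relative to the original


def storage_summary():
    """Return total capacity and two storage demands, all in TB."""
    capacity = DESKTOP_TB + SERVER_TB + PORTABLE_DRIVES * PORTABLE_DRIVE_TB
    return {
        "capacity_tb": capacity,
        "primary_with_copies_tb": PRIMARY_TB * COPIES,
        "analysis_working_set_tb": PRIMARY_TB * ANALYSIS_GROWTH,
    }


if __name__ == "__main__":
    for name, value in storage_summary().items():
        print(f"{name}: {value} TB")
```

Under these assumptions the duplicated primary data alone takes up most of the available capacity, and a fully expanded working copy would not fit next to it.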

Datasets of this size are nothing new, and an entire field, Big Data, is dedicated to figuring out how to analyse, store, and manage such data sets. Organizing and managing these kinds of data is not very different from organizing any data or primary research you might conduct during a PhD project, an MSc project or everyday life. The only difference here is magnitude.

I started my PhD over 2.5 years ago, and I went in naively thinking that setting up some folders to save things in an organized fashion would be enough. Little did I know that I would end up with so much data; ultimately, I have had to devise a system for managing it all on the fly. I would not recommend that. It makes things very confusing and is rather unhelpful.

When managing personal datasets and personal research there is no best method, so to speak. The best organization system is one that gets used and that works for the individual. Note: this is not true for widely used datasets, where versioning, a robust naming method, and consistent organization are key. That said, there are a few things that I have found make life much easier. Choose a method and stick with it. For example, if you start by putting the date in every file name so you know when the file was originally created, then you should continue with that.
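One way to stick with a date-in-the-name convention is to let a small script apply and check it. The sketch below is only an illustration: the `YYYY-MM-DD_` prefix format and the function names `dated_name` and `has_date_prefix` are assumptions, not the system described in this post.

```python
from datetime import date, datetime
from pathlib import Path


def dated_name(filename, created=None):
    """Return the file's name prefixed with an ISO date, e.g. 2020-01-31_scan.tif.

    Only the final path component is returned; any directory part is dropped.
    """
    created = created or date.today()
    return f"{created.isoformat()}_{Path(filename).name}"


def has_date_prefix(filename):
    """Check whether a file name already starts with a valid YYYY-MM-DD_ prefix."""
    name = Path(filename).name
    try:
        datetime.strptime(name[:10], "%Y-%m-%d")
    except ValueError:
        return False
    return name[10:11] == "_"
```

For example, `dated_name("scan.tif", date(2020, 1, 31))` gives `2020-01-31_scan.tif`, and `has_date_prefix` can be run over a folder to find the files that have drifted from the convention.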

Personally, for everyday work and everyday analyses I have a panoply of folders that are split up into categories, as you can see in the image below. I also store everything in a paid Dropbox account (not an advertisement, I just love the service) so that all the files are automatically stored in the cloud as well, and very basic versioning is performed. This works passably well for me, but may not work for everyone.

File organization tree

So why does this matter for anyone who is not doing a big academic research project? Everyone has research projects, even if they do not necessarily think of them in that way. Where do I want to go on vacation? Where do I want to host a party? What is the best restaurant in my price range in my city? These are all questions which can be researched in everyday life. There are many ways to do so: a fair number of people like to take the approach of flying by the seat of their pants, while others will create detailed dossiers of their options. Those who take a lackadaisical approach may have once found the perfect restaurant, but cannot remember where it was or how they found it. They then end up not being able to return (I do this all the time). Alternatively, some may compile documents with tens of vacation options only to decide that they are not going this year. Finding a method of organizing files, data, etc. that works for you can streamline your entire research process. I know it certainly worked that way for me.

Surprise: Catching bugs in rocks!

X-ray tomography is a powerful technique that allows us to see very tiny details inside a rock. However, image acquisition is usually just the starting point for image analysis. In order to get quantifiable information, one has to develop specific image processing algorithms. In porous media research, one of the most important processing steps is the development of a task-oriented image segmentation algorithm.
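To give a feel for what segmentation means, here is a deliberately simple sketch: it is not the algorithm used on this rock (which the post does not describe), just a fixed-threshold labelling of a tiny made-up grayscale slice into pore and solid, followed by a porosity estimate. The toy pixel values, the threshold of 100, and the convention that darker pixels are pores are all assumptions for illustration.

```python
def segment(slice_2d, threshold):
    """Label each pixel: True for pore (value below threshold), False for solid."""
    return [[value < threshold for value in row] for row in slice_2d]


def porosity(mask):
    """Fraction of pixels labelled as pore in a segmented slice."""
    total = sum(len(row) for row in mask)
    pores = sum(sum(row) for row in mask)  # True counts as 1
    return pores / total


# Hypothetical 3 x 4 grayscale slice (0 = black, 255 = white).
toy_slice = [
    [200, 190, 40, 35],
    [210, 50, 45, 180],
    [205, 195, 185, 175],
]

mask = segment(toy_slice, threshold=100)
print(f"porosity: {porosity(mask):.2f}")
```

Real tomography volumes are noisy and unevenly lit, which is exactly why a single global threshold is rarely enough and a task-oriented algorithm has to be developed.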

While trying our segmentation algorithm on a 3D image of a sedimentary rock, we found a curious piece of former life! The “worm” you can see in the video is an orthoceras, an ancient mollusk that is often found in sediments.

This carbonate rock was cored from the Miocene carbonate platform of Llucmajor, in Majorca. The rock has undergone a re-equilibration from aragonite to calcite (dissolution of aragonite and crystallization of calcite). This reaction led to the formation of porosity (grey parts of the picture). In this case the spatial distribution of the pores was controlled by the pre-existing structure of the rock. This process allowed the shape of the fossil to be preserved, even after re-equilibration and recrystallization into calcite. That is why we can see the orthoceras, although its skeleton has undergone chemical alteration.

Figure 1. This video is a series of 2D slices of a 3D volume. No orthoceras is actually swimming here.


Doing a PhD – What is scientific research like?

When thinking about your career prospects you may wonder what it would be like to stay at university and go on to complete a PhD with the aim of working in scientific research after that. You may ask yourself what type of struggles you will encounter, or how different it is from working in a company. Or you may just wonder how different it is from undergraduate and master’s-level science studies. Is it for you? Let’s find out.

PhD vs. BSc or MSc

Compared with a bachelor’s degree, a PhD focuses on a very specific topic in great detail and at a high level. During your bachelor’s degree you will have covered a very wide range of topics more superficially and written a thesis that is more descriptive, or more helpful for learning methods and concepts, than it is for advancing science. In comparison to master’s level, it depends. In research-oriented master’s degrees, the master’s thesis or dissertation will be a first taste of what research is actually like, albeit at a smaller scale. On the other hand, industry-oriented master’s degrees will be more relevant to the interests of a company or industry sector, and may therefore require skills that are better suited to that particular field of industry and applied science.


The abyssal crazy world

As part of the “Abyss” group, we might wonder what things look like down there, in the deep ocean. As you have probably already experienced, diving in water comes with (uncomfortable) changes of temperature and pressure. And that’s only a few meters! The conditions keep changing as you go deeper in the water column (more than freezing toes!). In the oceans, the abyssal waters are the part lying between 2000 m and 6000 m below sea level. At these depths, the temperature is roughly constant at 0 to 4 °C, the pressure reaches 200 to 600 atmospheres, and there is no light. Light is not only useful for seeing around; it is also the energy source for photosynthesis and hence sustains life at the surface of the Earth. Yet, although very poorly known, these depths allow life to exist. And what comes out of discoveries is sometimes very interesting or unexpected!
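The pressure figures can be checked with the standard hydrostatic relation: pressure at depth equals surface pressure plus seawater density times gravity times depth. The short Python sketch below does this under simplifying assumptions that are not from the post: a constant seawater density of 1025 kg/m³ (compression with depth is ignored) and one standard atmosphere at the surface.

```python
RHO_SEAWATER = 1025.0    # kg/m^3, typical near-surface value (assumption)
G = 9.81                 # m/s^2, gravitational acceleration
PA_PER_ATM = 101325.0    # pascals in one standard atmosphere


def pressure_atm(depth_m):
    """Approximate absolute pressure, in atmospheres, at a depth below sea level."""
    water_column_pa = RHO_SEAWATER * G * depth_m
    return 1.0 + water_column_pa / PA_PER_ATM  # 1 atm of air on top


for depth in (2000, 6000):
    print(f"{depth} m: roughly {pressure_atm(depth):.0f} atm")
```

With these assumptions the estimate lands close to 200 atmospheres at 2000 m and close to 600 atmospheres at 6000 m, in line with the rule of thumb of about one extra atmosphere for every 10 m of water.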

Life has to adapt to these difficult conditions of low temperature, high pressure, absence of light and scarcity of nutrients. The result is not exactly what we are used to: evolution sometimes leads to cool physical and morphological features! Let’s have a look at some inhabitants of the abysses.

Beyond 100 m, in the dark, cold water, plants disappear, likely because photosynthesis is impossible; life in the deep sea is 100% animal. With the disappearance of light at depth, numerous species evolved to be blind or, conversely, grew big, globular eyes in an attempt to catch any remaining light, like our very cute friend in Figure 1.

Figure 1: The extreme growth of the eyes allows the capture of every bit of light!


Life as a (semi-) nomadic early career scientist

One of the great things about being a geoscientist is that travel is often an integral part of your research and work. Geoscientists work in the field, we go to conferences and short courses all over the world, and some of us even move countries for our jobs. This often means being thrown headfirst into a new country and culture. An early career scientist (ECS) is someone who is very early in their scientific career, for example all of the regular authors at the SeaRocks blog. While the exact definition of who qualifies as an ECS varies, there is nearly always one constant: an ECS’s life, such as mine, is often filled with uncertainty about what, and where, is next. A PhD is a fixed-term contract. There are no guarantees that your next position, be it a post-doc or a job in industry, will be in your current city, or even on the same continent.

Chimneys and smoke at the bottom of the sea: signs of a fire below!

Black smokers are among the most spectacular features of geology (at least to me). They consist of hollow conduits where hot, nearly boiling fluids exit the oceanic crust. These fluids can be white or black since they carry particles from the underlying rocks. The smoke may contain valuable metals such as nickel, arsenic, copper, silver and even gold! On top of this, in the conduits or next to the chimneys, the most amazing and strange life forms can be found. All the more reason to explore what is going on there, and where we could expect systems like these to occur!

What are microbes, and wait, they are found in rocks?

A very (in)famous mathematician, Dr. Ian Malcolm of Jurassic Park, once said five very insightful and philosophical words: “Life, uh…finds a way.”

Dr. Ian Malcolm in Jurassic Park

While he was referring to breeding dinosaurs, which films have taught us is not a good idea, he was nevertheless correct in a different context. Life very often finds a way to exist in unexpected places and ways, and often that life is microorganisms.