IT Enables Research – Plant Science Salt Lab

27 October, 2020

Scientists should be doing science. Sometimes they also need to deal with technology, and we all know that technology can be very scary. Consider this scenario: R code is hosted on GitHub, and thanks to the Shiny package this code becomes a website. The scientist wants to host the website on a KAUST virtual machine and make it accessible to the world. She also wants that virtual machine to auto-magically pick up every change made on GitHub so that users everywhere see the changes at once. Read on to learn how IT Research Computing, through voodoo, made this dream come true.

We met recently with Prof. Magdalena (Magda) Julkowska, an Assistant Professor at Boyce Thompson Institute and a former Research Scientist in Prof. Mark Tester’s Salt Lab in the BESE division here at KAUST. Prof. Magda reached out to IT Research Computing for guidance on creating a website that KAUST publishes but that gets its contents from a GitHub repository.

Could you explain what Salt Lab does?

I was working for Prof. Mark Tester. We were looking at how stress affects plant growth and how plants react to stress. We used molecular tools to make transgenic plants, examining the effects of individual genes, and we were also doing high-throughput phenotyping. We had many large populations of plants that we were screening. These are called natural diversity panels. These large populations of plants come from distinct parts of the world. The assumption is that if you have a plant coming from a site next to the sea, that plant might be more salt stressed than a plant coming from somewhere in the mountains where it rains quite a lot and fresh water is abundant. We screen these populations for responses to stress. Then we combine this information with genetic markers to find the underlying genes using genome-wide association studies. This process generates enormous amounts of data, which can be explored in multiple ways.

Where were you storing all this data?

We started out by storing it on our local computers. We ran out of storage quickly. That is when we transferred all our data to DataWaha (author’s note: DataWaha is free for the first 10TB of data; you can buy extra storage to expand your quota).

Would you process data directly from DataWaha?

No, we would process data on our computers. We would store our raw data on DataWaha as it came off the machine. Once we had processed the data on our computers, we would also store the results there. Both data sets are valuable to us. We stored the processed results because these are what we use in our publications. It is also important to store the pipelines that generated the results, so that our results are reproducible. That is what brought us to create MVApp.

What were you doing in Salt Lab?

While working for Prof. Tester, I was looking at the root-to-shoot ratio and how salt stress changes it. We know that this ratio changes under drought, but we did not know how salt stress affects it. I was also helping the students who were working on the high-throughput phenotyping datasets in tomato, Arabidopsis, and other species. Since I became quite proficient with the R language (author’s note: R is a programming language and free software environment for statistical computing and graphics), people would come to me asking for help. This also led to designing MVApp so researchers could use R without my help.

How does your work at KAUST progress into your job at Boyce Thompson Institute?

At KAUST I learned about high-throughput phenotyping, both the data analysis part of it and setting up the experiments. I am planning to set up a high-throughput phenotyping facility at Boyce Thompson Institute. We are going to use a population of wild tomatoes developed at KAUST; Prof. Tester allowed me to take the seeds and build further resources for this population. We already developed the genetic markers at KAUST. I am going to screen the population for changes in plant architecture. Prof. Tester already screened it for salinity stress. Phenotyping usually looks at the plant as one big green blob. The green blob grows as the plant grows. You add stress and the blob grows a little bit slower. I am very interested in dividing that blob into the individual components of plant architecture. You have branches, leaves, new branches, flowers, and things like that. How are these components affected by stress? I will be moving away from salt stress because in the USA fresh water is so abundant that it might be difficult to get funding for it. We will be focusing on drought and heat, which are two other major stress factors predicted to increase with climate change.

You just told us that you brought data from KAUST to Boyce Thompson Institute. Do you foresee still collaborating with KAUST in the future?

Yes, the collaboration with Prof. Tester is still going on. Hopefully, I will be back to visit KAUST and present the new results sometime in the future!

What problem were you trying to solve?

All of us working in high-throughput phenotyping struggle with the amount of data generated. We are scoring, on average, thirty different phenotypes. That is thirty different measurements (e.g. plant height, plant size, etc.) per plant per individual time point. We usually have two different conditions: salt and control. We have seven different time points. This quickly adds up to hundreds of data points per individual plant, not to mention that we compare each plant with hundreds of other plants. All that data overwhelmed us. So that is when we decided to work on the application. R has this nice package called Shiny. This package lets you transform your R code into a clickable website. We thought that was great because we could solve many researchers’ pain points with a website.
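As a rough back-of-the-envelope check of the numbers quoted above, the short Python sketch below multiplies them out, taking the interview's figures (about thirty phenotypes, two conditions, seven time points) at face value. The population size of 300 is a made-up placeholder standing in for "hundreds of plants"; it is not a figure from the Salt Lab.

```python
# Back-of-the-envelope estimate of phenotyping data volume.
# The first three constants come from the interview; the population
# size is a hypothetical placeholder, not a real Salt Lab number.
PHENOTYPES_PER_TIME_POINT = 30      # e.g. plant height, plant size, ...
CONDITIONS = 2                      # salt and control
TIME_POINTS = 7
HYPOTHETICAL_POPULATION_SIZE = 300  # assumption: "hundreds" of plants


def data_points_per_plant(phenotypes: int, conditions: int, time_points: int) -> int:
    """Number of measurements collected for one plant across the experiment."""
    return phenotypes * conditions * time_points


def data_points_per_experiment(per_plant: int, population_size: int) -> int:
    """Total number of measurements for the whole screened population."""
    return per_plant * population_size


per_plant = data_points_per_plant(PHENOTYPES_PER_TIME_POINT, CONDITIONS, TIME_POINTS)
total = data_points_per_experiment(per_plant, HYPOTHETICAL_POPULATION_SIZE)

print(f"Data points per plant: {per_plant}")
print(f"Data points for {HYPOTHETICAL_POPULATION_SIZE} plants: {total}")
```

Under these assumptions each plant contributes 420 data points, which matches the "hundreds of data points per individual plant" described in the interview.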

What challenges did you face in solving your problem?

Accessibility. We would ask researchers to try it out. They had to download our code from my GitHub repository. The researchers would also have to type something into R. That is when we would lose them; this was a big hurdle for them since they “had to code” and for us since we could not easily share our code. 

We were trying to make our website available so researchers would not have to touch code at all. They would not have to go to GitHub, which scared them; heck, it scares me at times!

I can imagine all these biologists having issues with code. Being a biologist does not mean you also have to be a data scientist. We wanted to help them. We were struggling with it ourselves, and we have a good understanding of R, so imagine how “pure” biologists were struggling with it.

How did IT Research Computing help you?

IT Research Computing helped publish the MVApp website, https://mvapp.kaust.edu.sa. Now we have a domain for our website, and the virtual machine hosting the website picks up changes from my GitHub repo. Your team would also apply security patches to this virtual machine once a week to avoid security breaches.

Later, the website was containerized so it could run on IT Research Computing’s production Kubernetes cluster. This container business made our website more stable and easier to deploy. I had a couple of meetings with people (author’s note: our great DevOps engineers Asmaa Hassan & Nasr Hassanein) even though I did not understand what this container business was. They were always patient with me, explaining over and over the benefits of moving the website to Kubernetes. It helped especially that I did not have to build the container myself, since I do not know how to do that.

Can you tell us what comes next?

Once we published MVApp, we had some workshops at KAUST. Many researchers liked it. We had spontaneous contributions from the statistics department. It is nice that people want to contribute time and effort to your work. I also had many positive emails from around the world. Although I am not at KAUST anymore, I would love to continue working on it. I think apps like MVApp are valuable for people who have limited time to learn new things, especially with the technology landscape changing so fast. Containers are going to be a winning solution since they keep things stable and allow people to run MVApp easily on their own computers.

Prof. Magda, do you have any closing remarks?

It was amazing to work with IT Research Computing. When I arrived here at Boyce Thompson Institute, I started talking about MVApp and my experience publishing it to the world thanks to IT Research Computing’s help. The people from Boyce Thompson Institute looked at me and said “Oops, we do not have those kinds of services here”. This tells you that you should not take things for granted. I really appreciate that KAUST gave me the opportunity to work with IT Research Computing, a team of experts who knew what they were doing and how to help me; I am just a small biologist who magically wants her code to be accessible to anybody in the world. I recall the first time I met with your team and you had no idea what Shiny was; now it is inside a container running on a Kubernetes cluster.

Dr. Magda Julkowska (in photo) joined KAUST from the Netherlands, although she originally germinated in Poland. Dr. Magda transplanted herself from KAUST to the USA during the COVID-19 pandemic, which resulted in added stress. Dr. Magda can now empathize with the plants that she routinely transplants for her research.

GitOps@KAUST

IT Research Computing helped Prof. Magda by applying its GitOps knowledge and expertise to build a GitLab pipeline that mirrors her code from GitHub, checks that the code is syntactically correct, builds a Docker image (so it can be deployed anywhere a Docker runtime is installed), and finally deploys it to her customers. This pipeline makes Prof. Magda independent. She can push fixes and new features to her customers all on her own, without any further help from IT Research Computing.
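The order of the pipeline stages matters: a later stage should only run if every earlier one succeeded, so broken code never reaches the deployed website. The Python sketch below illustrates only that gating idea. The real pipeline is defined in GitLab and is not shown in this article; the stage names and placeholder functions here are hypothetical and do no real work.

```python
from typing import Callable, List, Tuple

# A stage is a name plus an action that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed: List[str] = []
    for name, action in stages:
        if not action():
            break  # later stages (e.g. deploy) are skipped
        completed.append(name)
    return completed


# Hypothetical placeholder actions; a real pipeline would call git,
# an R syntax check, a Docker build, and a deployment tool here.
def mirror_from_github() -> bool:
    return True


def check_syntax() -> bool:
    return True


def build_image() -> bool:
    return True


def deploy() -> bool:
    return True


STAGES: List[Stage] = [
    ("mirror", mirror_from_github),
    ("check", check_syntax),
    ("build", build_image),
    ("deploy", deploy),
]

completed = run_pipeline(STAGES)
print("All stages pass:", completed)

# Simulate a push with a syntax error: nothing after the check runs.
broken = run_pipeline([STAGES[0], ("check", lambda: False), STAGES[2], STAGES[3]])
print("Syntax check fails:", broken)
```

In this sketch a failed syntax check means the image is never built or deployed, so under this assumption users would keep seeing the last working version of the website.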

Creating a Shiny website is just a few clicks away, using a GitLab template created by IT Research Computing for its customers.

CONTACT US

KAUST Information Technology Department

it.kaust.edu.sa

We make IT happen!