Author Lawrence Collins
A step-by-step guide on how to analyse flex data using a bespoke Python package. This notebook contains all the information needed to get started with the software; further details are contained in the documentation.
If you are new to the program I would recommend going through this notebook and executing each cell containing the code one at a time. Have a play around with the functions to get a feel for how things work. All of the code is hidden behind the scenes leaving the user with a selection of simple commands that carry out highly customisable operations on their assay data.
The Jupyter Notebook gains access to the Calcium Flex package via Python's import system. Execute the following cell (ctrl+enter) to add the calcium flex analysis functionality to the notebook.
from calciumflexanalysis import calcium_flex as cal
For each plate, the package requires the raw data .txt file and a corresponding plate map. The plate map contains a description of each well of the assay, allowing easy categorisation of the data.
When creating a 'cal.CaFlexPlate' object, the user must supply several required arguments; additional (optional) arguments can also be set.
- raw_data: The raw data .txt file
- plate_map_file: The plate map .csv file, adhering to either the 'short' or 'long' template
- inject: The time at which the agonist (e.g. englerin A) was injected into the assay
- data_type: The type of data file being used, described using the colloquial terms 'new' and 'old'
- map_type: The type of plate map template being used, 'short' or 'long' - defaults to 'short'
- size: Size of the well plate - defaults to 96
- valid: Validates all wells in the assay - defaults to True
- title: Title of the assay
update: if units of 'uM' (or u-anything for that matter) are given in the 'Concentration Units' column, the code will automatically present the u as $ \mu $.
# text file to be read in (raw file from machine)
datafile = 'data_example 3 nM to 3 uM.txt'
# plate map csv file updated by user (either the 'short' or 'long' template)
mapcsv = 'plate-map_example 3nM to 3 uM.csv'
# insert the 2 files into the 'CaFlexPlate' class
# it is recommended the user name the plates via the system 'plate1', 'plate2', etc.
plate1 = cal.CaFlexPlate(raw_data = datafile, plate_map_file = mapcsv, inject = 60, data_type = 'new')
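The unit substitution mentioned in the update note above can be pictured as a simple string rewrite. The sketch below is an illustration only: `format_units` is a hypothetical helper, not part of the package, and it assumes that any unit string beginning with 'u' has that 'u' swapped for the mathtext symbol `$\mu$`.

```python
def format_units(units):
    """Swap a leading 'u' (micro) for the mathtext mu symbol.

    Hypothetical helper for illustration; the package's own logic may differ.
    """
    if len(units) > 1 and units.startswith("u"):
        return r"$\mu$" + units[1:]
    return units


print(format_units("uM"))  # prints $\mu$M
print(format_units("nM"))  # prints nM
```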
Two functions assist the user in checking the arrangement of the well plate and spotting any anomalous readings.
'see_plate' allows the user to quickly confirm whether they have correctly updated the plate map template. The label and color coding can be customised using the attributes 'labelby' and 'colorby', defaulting to 'Type'.
plate1.see_plate(colorby = 'Concentration', labelby = 'Contents')
'visualise_assay', like see_plate(), provides a simple visual check for the user. It differs, and is more useful, in that it plots the raw data for each well across the assay. Again, the user can stipulate the arguments 'colorby' and 'labelby'. The user must also explicitly state whether the y axis will be shared across the entire assay, using 'share_y = True' or 'share_y = False'.
plate1.visualise_assay(share_y = True, colorby = 'Concentration')
If you would like more information about a specific function, simply call help on it as follows:
help(cal.CaFlexPlate.see_plate)
Note how you first call the package 'cal', then the class object - in this instance 'CaFlexPlate' - and then the method you require more information on, 'see_plate'.
Visualising the assay allows the user to note any anomalous recordings. The user can then choose to have a closer look at the dodgy data and invalidate if necessary.
'see_wells()' plots the raw data of specific wells from the assay. The user can again label, color and share the y axes of the plots as they see fit.
# specific wells that look dodgy
dodgy = "C2", "C3"
plate1.see_wells(dodgy)
The user can invalidate individual wells or entire rows and columns.
plate1.invalidate_wells(["C2", "C3"])
plate1.invalidate_rows('B')
plate1.invalidate_cols(6)
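To make clear which wells a row or column invalidation covers, here is a small sketch for a standard 96-well plate (8 rows A to H, 12 columns). The helpers `wells_in_row` and `wells_in_col` are hypothetical and written only for this illustration; they are not part of the package.

```python
import string


def wells_in_row(row, n_cols=12):
    """List the well IDs in one row, e.g. 'B' -> B1 ... B12."""
    return [f"{row}{col}" for col in range(1, n_cols + 1)]


def wells_in_col(col, n_rows=8):
    """List the well IDs in one column, e.g. 6 -> A6 ... H6."""
    return [f"{letter}{col}" for letter in string.ascii_uppercase[:n_rows]]


print(wells_in_row("B"))
print(wells_in_col(6))
```

So, on a 96-well plate, invalidating row 'B' affects 12 wells and invalidating column 6 affects 8 wells.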
Visualising the assays will clearly show which wells are now invalidated.
plate1.visualise_assay(labelby = 'Concentration', share_y = True)
plate1.baseline_correct()
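This notebook does not describe how baseline_correct() works internally. A common approach, assumed here purely for illustration, is to subtract the mean of the pre-injection readings from every point of a well's trace. The function below is a standalone sketch with made-up numbers, not the package's implementation.

```python
from statistics import mean


def baseline_correct(times, values, inject):
    """Subtract the mean pre-injection reading from every reading.

    Assumption: the baseline is the mean of all readings taken before `inject`.
    """
    baseline = mean(v for t, v in zip(times, values) if t < inject)
    return [v - baseline for v in values]


# made-up trace: flat at 1.0 before injection at t = 60, then rising
times = [0, 20, 40, 60, 80, 100]
ratio = [1.0, 1.0, 1.0, 2.0, 3.0, 3.0]
corrected = baseline_correct(times, ratio, inject=60)
print(corrected)  # [0.0, 0.0, 0.0, 1.0, 2.0, 2.0]
```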
The program can automatically find the flattest mean gradient across the plate within a 10 point window for the following plateau calculations. The user can stipulate which data the window will be calculated from, either 'ratio' or 'baseline_corrected'.
plate1.get_window('baseline_corrected')
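The idea of a 'flattest window' can be sketched as a sliding-window search. The code below is a simplified illustration, not the package's implementation: it works on a single made-up mean trace rather than a whole plate, it assumes the gradient is a finite difference between consecutive time points, and it assumes flatness is scored by the mean absolute gradient. `flattest_window` is a hypothetical name.

```python
def flattest_window(times, values, inject, width=10):
    """Return the start index of the flattest post-injection window.

    Flatness is scored as the mean absolute gradient between the
    `width` consecutive points of the window (an assumption).
    """
    # finite-difference gradient between neighbouring time points
    grads = [
        abs((values[i + 1] - values[i]) / (times[i + 1] - times[i]))
        for i in range(len(values) - 1)
    ]
    # first reading taken at or after the injection
    first = next(i for i, t in enumerate(times) if t >= inject)
    best_start, best_score = None, float("inf")
    for start in range(first, len(values) - width + 1):
        window_grads = grads[start:start + width - 1]
        score = sum(window_grads) / len(window_grads)
        if score < best_score:
            best_start, best_score = start, score
    return best_start


# made-up trace: baseline, linear rise after injection, plateau, slow decay
times = [5 * i for i in range(40)]
values = []
for i in range(40):
    if i < 12:
        values.append(0.0)
    elif i <= 22:
        values.append(float(i - 12))
    elif i <= 31:
        values.append(10.0)
    else:
        values.append(10.0 - 0.5 * (i - 31))

start = flattest_window(times, values, inject=60)
print(start, times[start])  # 22 110
```

Restricting the search to points after the injection keeps the flat pre-injection baseline from being picked as the plateau.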
If the user does not deem the calculated window suitable, it is possible to manually define the point from which the response amplitudes will be calculated. The user must state the start time point and the data from which the window will be taken, 'ratio' or 'baseline_corrected'.
# # uncomment to manually set the window
# plate1.def_window(200, 'baseline_corrected')
plate1.window # .window returns the index from which the window will be taken
'plot_conditions' plots the change in calcium flux for each mean condition versus time. The user can define which data will be plotted, either 'ratio' or 'baseline_corrected'. The window from which the response amplitude will be calculated can also be shown on the graph, using show_window = True.
Multiple controls can be plotted on the graph using the argument control (default = ['control']). Make sure this is passed as a list, e.g. control = ['control-1', 'control-2']. Controls can be hidden from the plot using show_control = False.
plate1.plot_conditions('baseline_corrected', activator = "EA (30 nM)", title = 'TRPC5-SYFP2', show_window = True)
update: There is now an option to set 'unique markers'. Simply add the argument unique_markers = True. The markers used can be changed by updating the 'markers_list' attribute, default = ["o", "^", "s", "D", "p", "*", "v"]. See https://matplotlib.org/3.1.0/api/markers_api.html for information on the markers that can be used.
plate1.plot_conditions('ratio', activator = "EA (30 nM)", title = 'TRPC5-SYFP2', show_window = False,
unique_markers = True, show_control = False)
'amplitude()' calculates the response amplitude of each well. Again, the user must specify which data they want to use.
note: This does not baseline correct the data, that is achieved using the baseline_correct function (see above). You must carry out the baseline correction before calculating baseline corrected amplitudes.
plate1.amplitude('baseline_corrected')
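As a rough picture of what a response amplitude is, the sketch below takes the mean of a well's signal over the 10 point plateau window. This is an assumption made for illustration (the package may compute the amplitude differently), and `response_amplitude` is a hypothetical helper applied to a made-up trace.

```python
from statistics import mean


def response_amplitude(values, window_start, width=10):
    """Mean signal over the `width` points starting at `window_start`."""
    return mean(values[window_start:window_start + width])


# made-up baseline-corrected trace with a 10 point plateau at 5.0
trace = [0.0] * 5 + [2.0, 4.0] + [5.0] * 10 + [4.0, 3.0]
print(response_amplitude(trace, window_start=7))  # 5.0
```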
If the user desires, they can normalise their data to the plate control (where Type = 'control' on your plate map).
plate1.normalise()
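Normalising to the plate control can be sketched as dividing every amplitude by the mean amplitude of the control wells. Expressing the result as a percentage is an assumption made here; the function name and the numbers are hypothetical.

```python
from statistics import mean


def normalise_to_control(amplitudes, well_types):
    """Express each amplitude as a percentage of the mean control amplitude."""
    control_mean = mean(
        a for a, t in zip(amplitudes, well_types) if t == "control"
    )
    return [100 * a / control_mean for a in amplitudes]


amps = [4.0, 4.0, 2.0, 1.0]
types = ["control", "control", "compound", "compound"]
print(normalise_to_control(amps, types))  # [100.0, 100.0, 50.0, 25.0]
```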
Amplitudes for each condition are collected and averaged. The user can choose whether to use the normalised data.
plate1.mean_amplitude(use_normalised = True)
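The collect-and-average step can be pictured as grouping wells by condition and taking the mean of each group. In the sketch below the condition is simply the concentration, and the error is the standard error of the mean; both choices are assumptions for illustration, and `mean_amplitudes` is a hypothetical helper rather than the package's method.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def mean_amplitudes(wells):
    """Group (concentration, amplitude) pairs and return {conc: (mean, sem)}."""
    grouped = defaultdict(list)
    for conc, amp in wells:
        grouped[conc].append(amp)
    summary = {}
    for conc, amps in sorted(grouped.items()):
        # a single replicate has no spread, so report an error of 0.0
        sem = stdev(amps) / sqrt(len(amps)) if len(amps) > 1 else 0.0
        summary[conc] = (mean(amps), sem)
    return summary


# made-up normalised amplitudes at two concentrations (nM)
wells = [(3, 90.0), (3, 110.0), (30, 40.0), (30, 60.0)]
for conc, (avg, sem) in mean_amplitudes(wells).items():
    print(conc, avg, round(sem, 3))
```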
Dose-response curves fitted to either an IC$_{50}$ or EC$_{50}$ can now be plotted using the mean amplitudes. The user must state whether they want to do an 'ic50' or 'ec50' fit. The function also includes several optional arguments:
- combine = True: plots multiple compounds and/or proteins on the same figure, combine = False plots each curve on a separate figure
- error_bar = True: reveals error bars at each concentration
- title: sets the figure title
- show_top_bot = True: reveals the top and bottom values from the curve fitting function
- The user can explicitly stipulate which proteins and/or compounds are plotted using proteins = [list of proteins] and compounds = [list of compounds] (n.b. the user does not have to state this if they want to plot every protein/compound)
- activator: The agonist injected into the assay. This is especially useful when presenting normalised data; the y axis label will contain the agonist name if stated.
plate1.plot_curve('ic50', activator = '30 nM EA', title = "TRPC5-SYFP2", use_normalised = True)
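Dose-response data are commonly described by a four-parameter logistic (Hill) equation. The sketch below evaluates one such curve to show how the top, bottom and IC$_{50}$ values relate to each other; it is an assumption that plot_curve fits an equation of this form, and the function name and parameter values are hypothetical.

```python
def four_param_logistic(conc, bottom, top, ic50, hill):
    """Four-parameter logistic curve.

    With hill > 0 the response falls from `top` to `bottom` as the
    concentration increases, passing the midpoint at conc == ic50.
    """
    return bottom + (top - bottom) / (1 + (conc / ic50) ** hill)


# made-up parameters: 100 % at low dose, 0 % at high dose, IC50 of 30 nM
for conc in [3, 30, 300]:
    response = four_param_logistic(conc, bottom=0.0, top=100.0, ic50=30.0, hill=1.0)
    print(conc, round(response, 1))
```

With a negative Hill coefficient the same expression rises from bottom to top with increasing concentration, which is the shape expected of an activator (EC$_{50}$) curve.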
This method will export data produced at each stage of the analysis to an excel document. The data sets are placed into separate sheets.
The title of the excel file can be set as an argument within the export_data function. If no title is specified, the resulting document will be named after the title of the plate (the title of the plate can be set in the first step - see top of the page).
# plate1.export_data("test_export.xlsx")
It is also possible to group together several assays for a combined analysis. This uses the 'multiplate' subpackage - if you want to call help on one of these functions, call 'help(mp.CaFlexGroup.insert_function_here)'.
Normally this would be done at the start of the notebook.
from calciumflexanalysis import multiplate as mp
As before, the user must upload the raw text file and a corresponding plate map for each well plate.
# text files to be read in
datafile = 'data_example 3 nM to 3 uM.txt'
datafile2 = 'data_example_multiple_compounds.txt'
# plate map csv file updated by user (either the 'short' or 'long' template)
mapcsv = 'plate-map_example 3nM to 3 uM.csv'
mapcsv2 = 'plate-map_example_multiple_compounds.csv'
# insert each pair of files into the 'CaFlexPlate' class
plate1 = cal.CaFlexPlate(raw_data = datafile, plate_map_file = mapcsv, inject = 60, data_type = 'new')
plate2 = cal.CaFlexPlate(raw_data = datafile2, plate_map_file = mapcsv2, inject = 60, data_type = 'old')
The resulting CaFlexPlate objects can then be added to a 'CaFlexGroup', which allows the user to do grouped operations on all plates simultaneously.
plates = mp.CaFlexGroup([plate1, plate2])
Most of the operations performed for a single plate are essentially the same as those for the resulting CaFlexGroup object, albeit with some minor alterations!
The user may find it useful to know what the title of each plate is. These are automatically named after the raw data files, but can be retitled using simple dictionary indexing. This sort of indexing to access individual plates can be applied elsewhere in the package.
plates.titles
plates.titles['plate_1'] # indexing 'plate_1' accesses its title.
The user can then simply set the indexed plate title to another string:
plates.titles['plate_1'] = 'my favourite plate' # reset the title
plates.titles # show all titles
These are the sister functions of visualise_assay() and see_plate(). The same arguments such as 'share_y', 'labelby' and 'colorby' apply here.
plates.see_plates(title = 'I owe Lawrence a pint', colorby = 'Concentration')
plates.visualise_plates(share_y = False, title = "So much time saved!", dpi = 120)
As mentioned above, this works in much the same way as the analysis of an individual plate; however, it is important to note a couple of minor differences. Some functions, like 'see_plates' and 'visualise_plates', essentially do their job by performing their function on each plate instance in the group. Other functions, as described below, may collate data from each plate and perform a grouped operation.
This will become handy, for example, if the user would like to analyse data for an individual compound that is spread over multiple plates.
This will baseline correct each plate separately. If you want to have a look at the numbers, the baseline corrected data can be accessed at self.data['baselinecorrected'][plate(insert number)]. For defining the window, grouping data, plotting conditions, and calculating the amplitudes, you will have a choice to use 'baseline_corrected' or 'ratio'.
plates.baseline_correct()
Finds the lowest overall mean gradient across a ten time point window post injection for the plates.
plates.get_window('baseline_corrected')
Manually sets each plateau window.
plates.def_window(210, 'baseline_corrected')
This plots each mean condition for each plate as well as each protein and compound, versus time.
With this and the following plotting functions, the user can choose which protein/compound combinations to plot. All this requires is adding the argument proteins = [insert_name, insert_name] or compounds = [insert_name]. The square brackets must be used and multiple names can be inserted.
plates.plot_conditions('ratio', show_window = True, title = "TRPC5-Collins", error = False)
Calculates the response amplitude versus time. This function will simultaneously collate the data to self.data['plateau']['data'] as well as updating each plate object. This makes it handy for the user, giving the option to either keep each plate separate or combine the data.
plates.amplitude('baseline_corrected')
This is the first function where the user can decide whether to combine their data or not. I would always recommend combining the data (which is the function's default option) as the user is still able to separate the data for each plate at a later stage. The function spits out a nice table showing the amplitudes and errors at each concentration.
Set use_normalised = True to calculate the normalised amps.
plates.mean_amplitude()
This will normalise the data calculated by the function amplitude()
Normalise also defaults to combine = True
Normalises to the mean control.
note: make sure to set use_normalised = True at every subsequent step if you want to use the normalised data.
Set combine = False when calculating mean amplitudes from the normalised data if you want to normalise to each plate's own control. combine = True will normalise to the mean control over all the plates. Subsequently executing plot_curve(combine = True) will plot the combined data, with each plate normalised to its own control.
plates.normalise(combine = True)
plates.mean_amplitude(use_normalised = True, combine = True)
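The difference between normalising to each plate's own control and normalising to the mean control over all plates can be seen with a small worked example. Everything below is a standalone sketch with made-up amplitudes: the dictionary layout, the function names and the percentage scaling are assumptions for illustration, not the package's internals.

```python
from statistics import mean

# made-up amplitudes for two plates with different control responses
example_plates = {
    "plate_1": {"control": [4.0, 4.0], "compound": [2.0]},
    "plate_2": {"control": [8.0, 8.0], "compound": [2.0]},
}


def normalise_per_plate(group):
    """Normalise each plate's compound wells to that plate's own control."""
    out = {}
    for name, wells in group.items():
        control_mean = mean(wells["control"])
        out[name] = [100 * a / control_mean for a in wells["compound"]]
    return out


def normalise_pooled(group):
    """Normalise every plate to the mean control across all plates."""
    control_mean = mean(a for wells in group.values() for a in wells["control"])
    return {
        name: [100 * a / control_mean for a in wells["compound"]]
        for name, wells in group.items()
    }


print(normalise_per_plate(example_plates))  # {'plate_1': [50.0], 'plate_2': [25.0]}
print(normalise_pooled(example_plates))
```

The same raw amplitude of 2.0 becomes 50 % and 25 % when each plate uses its own control, but an identical value on both plates when the pooled control is used, so the choice matters whenever control responses differ between plates.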
This function gives the user a choice whether to combine the data from all the plates, and also combine each protein/compound onto a plot. For example, if the user has a compound spread over the multiple plates, they can either combine their data into a single IC50 or plot two, by setting combine_plates = True or False, respectively.
If there is more than one compound/protein across the plates, these can be plotted on the same graph or on separate graphs by setting combine = True or False, respectively. Have a play around with the combine_plates and combine settings!
Currently if you want to plot data that is normalised to each plate, the only way to plot it is to set combine_plates = False. combine_plates = True will only plot data that is normalised over all the plates.
plates.plot_curve('ic50', combine_plates = True, combine = True, use_normalised = True, activator = 'EA 30 nM', title = 'TRPC5-SYFP2')
As for a single plate, this method will export data produced at each stage of the analysis to an excel document. The data sets are placed into separate sheets.
The title of the excel file can be set as an argument within the export_data function. If no title is specified, the resulting document will be named after the title of each plate (the title of the plate can be set in the first step - see top of the page).
There is also a second argument 'combine_plates'. Setting this to False will produce separate excel files for each plate. If combine_plates is set to True, a single Excel document will be created containing the data for every plate. Separate sheets will be produced for each protein-compound combination across the plates.
# plates.export_data(combine_plates = True)
EC$_{50}$ curves can be plotted in the same way.
data = 'activator_example_data.txt'
plate_map = 'activator_example_map.csv'
plate3 = cal.CaFlexPlate(raw_data = data, plate_map_file = plate_map, inject = 60, data_type = 'old')
There is now the option to plot multiple controls (set in the 'Type' column of the plate map).
default = ['control']
note: make sure to insert any new control arguments in a list using square brackets.
plate3.plot_conditions('ratio', activator = "activator", title = "TRPC-SYFP2", control = ['control', 'control2'])
plate3.baseline_correct()
plate3.get_window('baseline_corrected')
plate3.amplitude('baseline_corrected')
plate3.mean_amplitude()
plate3.plot_curve('ec50', title = 'TRPC-SYFP2')