Calcium Flex Analysis using Python¶

Author: Lawrence Collins

A step-by-step guide to analysing flex data with a bespoke Python package. This notebook contains all the information needed to get started with the software; further details can be found in the documentation.

If you are new to the program I would recommend going through this notebook and executing each cell containing the code one at a time. Have a play around with the functions to get a feel for how things work. All of the code is hidden behind the scenes leaving the user with a selection of simple commands that carry out highly customisable operations on their assay data.

Single plate processing¶

Importing the package¶

The Jupyter Notebook gains access to the Calcium Flex package via Python's import system. Execute the following cell (ctrl+enter) to add the calcium flex analysis functionality to the notebook.

from calciumflexanalysis import calcium_flex as cal

Uploading Flex data¶

For each plate, the package requires the raw data .txt file and a corresponding plate map. The plate map contains a description of each well of the assay, allowing easy categorisation of the data.

When creating a 'CaFlexPlate' object (imported above as 'cal.CaFlexPlate'), the user must supply several mandatory arguments; additional optional arguments can also be given.

Mandatory¶

 - raw_data: The raw data .txt file
 - plate_map_file: The plate map .csv file, adhering to either the 'short' or 'long' template
 - inject: The time at which the agonist (e.g. englerin A) was injected into the assay
 - data_type: The type of data file being used, described using the colloquial terms 'new' and 'old'

Optional¶

 - map_type: The type of plate map template being used, 'short' or 'long' - defaults to 'short'
 - size: Size of the well plate - defaults to 96
 - valid: Validates all wells in the assay - defaults to True
 - title: Title of the assay

update: if units of 'uM' (or u-anything for that matter) are given in the 'Concentration Units' column, the code will automatically present the u as $ \mu $.
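As a rough illustration of the unit handling described above, the sketch below swaps a leading 'u' for a LaTeX mu. The function name `format_units` is hypothetical and this is not the package's own code; it only shows the kind of substitution the note describes.

```python
def format_units(units: str) -> str:
    """Replace a leading 'u' (micro) with a LaTeX mu for plot labels.

    Hypothetical helper: the package's internal implementation is not shown
    in this notebook and may differ.
    """
    # Only treat 'u' as "micro" when it prefixes a longer unit string.
    if len(units) > 1 and units.startswith("u"):
        return r"$\mu$" + units[1:]
    return units


print(format_units("uM"))  # a micromolar label rendered with mu
print(format_units("nM"))  # left unchanged
```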

# text file to be read in (raw file from machine)
datafile = 'data_example 3 nM to 3 uM.txt' 

# plate map csv file updated by user (either the 'short' or 'long' template)
mapcsv = 'plate-map_example 3nM to 3 uM.csv' 

# insert the 2 files into the 'CaFlexPlate' class
# it is recommended the user name the plates via the system 'plate1', 'plate2', etc.
plate1 = cal.CaFlexPlate(raw_data = datafile, plate_map_file = mapcsv, inject = 60, data_type = 'new')

Uploaded!

Visual inspection¶

Two functions help the user check the arrangement of the well plate and spot any anomalous readings.

see_plate()¶

'see_plate' allows the user to quickly confirm whether they have correctly updated the plate map template. The label and color coding can be customised using the attributes 'labelby' and 'colorby', defaulting to 'Type'.

plate1.see_plate(colorby = 'Concentration', labelby = 'Contents')

visualise_assay()¶

'visualise_assay', like see_plate(), provides a simple visual check for the user. It differs, and is more useful, in that it plots the raw data for each well across the assay. Again, the user can stipulate the arguments 'colorby' and 'labelby'. The user must also explicitly state whether the y axis will be shared across the entire assay, using 'share_y = True' or 'share_y = False'.

plate1.visualise_assay(share_y = True, colorby = 'Concentration')

a quick side note: calling for help¶

If you would like more information about a specific function, simply call help on it as follows:

help(cal.CaFlexPlate.see_plate)

Help on function see_plate in module calciumflexanalysis.calcium_flex:

see_plate(self, title='', export=False, cmap='Paired', colorby='Type', labelby='Type', dpi=150)
    Returns a visual representation of the plate map.
    
    The label and colour for each well can be customised to be a variable, for example 'Compound', 'Protein', 'Concentration', 'Concentration Units', 'Contents' or 'Type'. The size of the plate map used to generate the figure can be either 6, 12, 24, 48, 96 or 384. 
    :param size: Size of platemap, 6, 12, 24, 48, 96 or 384, default = 96
    :type size: int    
    :param export: If 'True' a .png file of the figure is saved, default = False
    :type export: bool
    :param title: Sets the title of the figure, optional
    :type title: str
    :param cmap: Sets the colormap for the color-coding, default = 'Paired'
    :type cmap: str
    :param colorby: Chooses the parameter to color code by, for example 'Type', 'Contents', 'Concentration', 'Compound', 'Protein', 'Concentration Units', default = 'Type'
    :type colorby: str
    :param labelby: Chooses the parameter to label code by, for example 'Type', 'Contents', 'Concentration', 'Compound', 'Protein', 'Concentration Units', default = 'Type'
    :type labelby: str
    :param dpi: Size of the figure, default = 150
    :type dpi: int
    :return: Visual representation of the plate map.
    :rtype: figure

Note how you first name the module alias 'cal', then the class (in this instance 'CaFlexPlate'), and then the method you require more information on, 'see_plate'.

Invalidation¶

Visualising the assay allows the user to note any anomalous recordings. The user can then choose to have a closer look at the dodgy data and invalidate if necessary.

see_wells()¶

'see_wells()' plots the raw data of specific wells from the assay. The user can again label, color and share the y axes of the plots as they see fit.

# specific wells that look dodgy
dodgy = "C2", "C3"
plate1.see_wells(dodgy)

The user can invalidate individual wells or entire rows and columns.

plate1.invalidate_wells(["C2", "C3"])

plate1.invalidate_rows('B')

plate1.invalidate_cols(6)

['C2', 'C3'] invalidated
Row B invalidated
Columns 6 invalidated
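To make the bookkeeping behind invalidation concrete, here is a minimal stand-alone sketch that tracks a validity flag per well on a 96-well plate. The function names mirror the calls above for readability only; this is an assumed model of the behaviour, not the package's implementation.

```python
from string import ascii_uppercase


def make_plate(rows=8, cols=12):
    """Map each well ID (e.g. 'A1') to a validity flag, all True to start."""
    return {
        f"{ascii_uppercase[r]}{c}": True
        for r in range(rows)
        for c in range(1, cols + 1)
    }


def invalidate_wells(plate, wells):
    """Mark the listed wells as invalid."""
    for well in wells:
        plate[well] = False


def invalidate_rows(plate, row):
    """Mark every well whose row letter matches `row` as invalid."""
    for well in plate:
        if well[0] == row:
            plate[well] = False


def invalidate_cols(plate, col):
    """Mark every well whose column number matches `col` as invalid."""
    for well in plate:
        if int(well[1:]) == col:
            plate[well] = False


plate = make_plate()
invalidate_wells(plate, ["C2", "C3"])
invalidate_rows(plate, "B")
invalidate_cols(plate, 6)

invalid = sorted(well for well, ok in plate.items() if not ok)
print(len(invalid), "wells invalidated")
```

Well B6 belongs to both row B and column 6, so it is only counted once in the total.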

Visualising the assays will clearly show which wells are now invalidated.

plate1.visualise_assay(labelby = 'Concentration', share_y = True)

Data Analysis¶

Baseline Correction: baseline_correct()¶

The user can now baseline correct the data if they wish.

plate1.baseline_correct()

Baseline corrected! See self.processed_data['baseline_corrected']
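The notebook does not spell out how the baseline is computed. A common approach, assumed here purely for illustration, is to subtract the mean of the readings recorded before the injection time from the whole trace:

```python
from statistics import mean


def baseline_correct(times, values, inject):
    """Subtract the mean pre-injection reading from every point in a trace.

    Assumption: the baseline is the mean of all readings taken before
    `inject`. The package may define its baseline differently.
    """
    baseline = mean(v for t, v in zip(times, values) if t < inject)
    return [v - baseline for v in values]


# Hypothetical single-well trace with the injection at t = 60 s.
times = [0, 20, 40, 60, 80, 100]
ratio = [1.0, 1.5, 1.25, 1.25, 2.25, 3.25]
corrected = baseline_correct(times, ratio, inject=60)
print(corrected)
```

After correction the pre-injection readings are centred on zero, so later amplitudes are measured relative to the resting signal.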

get_window()¶

The program can automatically find the 10-point window with the flattest mean gradient across the plate; this window is then used for the plateau calculations that follow. The user can stipulate which data the window is calculated from, either 'ratio' or 'baseline_corrected'.

plate1.get_window('baseline_corrected')
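The idea of searching for the flattest window can be sketched as a simple sliding-window scan. The scoring used here (mean absolute difference between neighbouring points) and the name `flattest_window` are assumptions for illustration; the package's exact criterion is not shown in this notebook.

```python
def flattest_window(trace, width=10, start_from=0):
    """Return (start, stop) indices of the flattest `width`-point window.

    Flatness is scored as the mean absolute point-to-point gradient, which
    is an assumed stand-in for the package's own measure.
    """
    best_start, best_score = None, float("inf")
    for start in range(start_from, len(trace) - width + 1):
        window = trace[start:start + width]
        score = sum(abs(b - a) for a, b in zip(window, window[1:])) / (width - 1)
        if score < best_score:
            best_start, best_score = start, score
    if best_start is None:
        raise ValueError("trace is too short for the requested window")
    return best_start, best_start + width


# Hypothetical mean trace: rise after injection (index 3), then a plateau.
trace = [0, 0, 0, 1, 2, 3, 4, 4, 4, 4, 4, 3, 2]
window = flattest_window(trace, width=3, start_from=3)
print(window)
```

Passing `start_from` as the injection index restricts the search to post-injection data, so the flat pre-injection baseline is never selected.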

def_window()¶

If the user does not deem the calculated window suitable, it is possible to manually define the point from which the response amplitudes will be calculated. The user must state the start time point and the data from which the window will be taken, 'ratio' or 'baseline_corrected'.

# # uncomment to manually set the window
# plate1.def_window(200, 'baseline_corrected')
plate1.window # .window returns the index from which the window will be taken

(41, 51)

plot_conditions()¶

'plot_conditions' plots the change in calcium flux for each mean condition versus time. The user can define which data will be plotted, either 'ratio' or 'baseline_corrected'. The window from which the response amplitude will be calculated can also be shown on the graph, using show_window = True.
Multiple controls can be plotted on the graph using the argument control (default = ['control']). Make sure this is passed as a list, e.g. control = ['control-1', 'control-2']. Controls can be hidden from the plot using show_control = False.

plate1.plot_conditions('baseline_corrected', activator = "EA (30 nM)", title = 'TRPC5-SYFP2', show_window = True)

update: There is now an option to set 'unique markers'. Simply add the argument unique_markers = True. The markers used can be changed by updating the 'markers_list' attribute, default = ["o", "^", "s", "D", "p", "*", "v"]. See https://matplotlib.org/3.1.0/api/markers_api.html for information on the markers that can be used.

plate1.plot_conditions('ratio', activator = "EA (30 nM)", title = 'TRPC5-SYFP2', show_window = False, 
                       unique_markers = True, show_control = False)
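One plausible way to hand out unique markers is to cycle through the marker list, wrapping around when there are more conditions than markers. The helper below is a hypothetical sketch of that idea using the default list quoted above; how the package assigns markers internally is not shown here.

```python
from itertools import cycle

# Default marker list quoted in the text above.
markers_list = ["o", "^", "s", "D", "p", "*", "v"]


def assign_markers(conditions, markers):
    """Pair each condition with a marker, reusing markers if they run out."""
    return dict(zip(conditions, cycle(markers)))


# Hypothetical condition labels, one more than the number of markers.
conditions = ["3 nM", "10 nM", "30 nM", "100 nM", "300 nM", "1 uM", "3 uM", "control"]
assigned = assign_markers(conditions, markers_list)
for condition, marker in assigned.items():
    print(condition, marker)
```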

Amplitudes¶

'amplitude()' calculates the response amplitude of each well. Again, the user must specify which data they want to use.
note: This does not baseline correct the data; that is done with the baseline_correct function (see above). You must carry out the baseline correction before calculating baseline corrected amplitudes.

plate1.amplitude('baseline_corrected')

Amplitudes calculated from baseline_corrected. See self.processed_data['plateau']['data']
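As a sketch of what a plateau amplitude could look like, the example below averages each well's trace over the window indices. Taking the mean within the window is an assumption made for illustration; the package may compute its amplitudes differently.

```python
from statistics import mean


def amplitude(trace, window):
    """Mean signal within the (start, stop) index window (assumed definition)."""
    start, stop = window
    return mean(trace[start:stop])


# Hypothetical baseline-corrected traces for two wells.
traces = {
    "A1": [0.0, 0.0, 0.5, 1.0, 2.0, 2.0, 2.0, 2.0, 1.5],
    "A2": [0.0, 0.0, 0.25, 0.5, 1.0, 1.0, 1.0, 1.0, 0.75],
}
window = (4, 8)
amps = {well: amplitude(trace, window) for well, trace in traces.items()}
print(amps)
```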

Normalisation¶

If the user desires, they can normalise their data to the plate control (where Type = 'control' on your plate map).

plate1.normalise()

C:\Users\lawre\anaconda3\lib\site-packages\pandas\core\frame.py:3997: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  errors=errors,
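Normalising to the plate control can be pictured as expressing each amplitude as a percentage of the mean control amplitude, so that the control sits at 100. The sketch below uses made-up amplitudes and a hypothetical `normalise` helper, and is not taken from the package.

```python
from statistics import mean


def normalise(amplitudes, control_wells):
    """Express every amplitude as a percentage of the mean control amplitude."""
    control_mean = mean(amplitudes[well] for well in control_wells)
    return {well: 100 * amp / control_mean for well, amp in amplitudes.items()}


# Hypothetical amplitudes: A1 and A2 are the control wells.
amps = {"A1": 2.0, "A2": 3.0, "B1": 1.25, "B2": 0.5}
normed = normalise(amps, control_wells=["A1", "A2"])
print(normed)
```

Individual control wells scatter around 100 while their mean is exactly 100, which is why per-well values above 100 are expected.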

mean_amplitude()¶

Amplitudes for each condition are collected and averaged. The user can choose whether to use the normalised data.

plate1.mean_amplitude(use_normalised = True)

C:\Users\lawre\anaconda3\lib\site-packages\pandas\core\frame.py:3997: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  errors=errors,
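Collecting and averaging amplitudes per condition amounts to a group-by followed by a mean and an error estimate. The sketch below groups by (compound, concentration) and reports the standard error of the mean; whether the package reports the standard error or another measure is not stated in this notebook, so treat that choice as an assumption.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def mean_amplitude(rows):
    """Group (compound, concentration, amplitude) rows and average each group.

    Returns {(compound, concentration): (mean, standard error)}. The choice
    of standard error as the error measure is an assumption.
    """
    groups = defaultdict(list)
    for compound, concentration, amp in rows:
        groups[(compound, concentration)].append(amp)

    summary = {}
    for key, amps in groups.items():
        # A single replicate has no spread to estimate.
        error = stdev(amps) / sqrt(len(amps)) if len(amps) > 1 else float("nan")
        summary[key] = (mean(amps), error)
    return summary


# Hypothetical per-well amplitudes.
rows = [
    ("Inhibitor", 10, 1.0),
    ("Inhibitor", 10, 3.0),
    ("Inhibitor", 100, 0.5),
    ("Inhibitor", 100, 0.5),
]
summary = mean_amplitude(rows)
print(summary)
```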

Curve fitting: plot_curve()¶

Dose-response curves fitted to either an IC$_{50}$ or EC$_{50}$ can now be plotted using the mean amplitudes. The user must state whether they want to do an 'ic50' or 'ec50' fit. The function also includes several optional arguments:

 - combine = True: plots multiple compounds and/or proteins on the same figure, combine = False plots each curve on a separate figure
 - error_bar = True: reveals error bars at each concentration
 - title: sets the figure title
 - show_top_bot = True: reveals the top and bottom values from the curve fitting function
 - The user can explicitly stipulate which proteins and/or compounds are plotted using proteins = [list of proteins] and compounds = [list of compounds] (n.b. the user does not have to state this if they want to plot every protein/compound)
 - activator: The agonist injected into the assay. This is especially useful when presenting normalised data; the y axis label will contain the agonist name if stated.

plate1.plot_curve('ic50', activator = '30 nM EA', title = "TRPC5-SYFP2", use_normalised = True)

C:\Users\lawre\anaconda3\lib\site-packages\pandas\core\frame.py:3997: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  errors=errors,
C:\Users\lawre\anaconda3\lib\site-packages\calciumflexanalysis\calcium_flex.py:47: RuntimeWarning: invalid value encountered in power
  z=(ic50/x)**hill
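The warning above shows the package evaluating z = (ic50/x)**hill; the rest of the fitted expression is not visible in this notebook. The sketch below plugs that term into the standard four-parameter logistic (Hill) form for an inhibition curve, which is an assumption rather than the package's confirmed equation. As background, NumPy emits "invalid value encountered in power" when a negative number is raised to a non-integer power, so the sketch rejects non-positive inputs.

```python
def inhibition_response(x, top, bottom, ic50, hill):
    """Four-parameter logistic response at concentration x (assumed form).

    The response tends to `top` as x approaches 0, to `bottom` as x grows
    large, and sits halfway between the two at x == ic50.
    """
    if x <= 0 or ic50 <= 0:
        raise ValueError("concentration and ic50 must be positive")
    z = (ic50 / x) ** hill
    return bottom + (top - bottom) * z / (1 + z)


# Hypothetical parameters for a normalised inhibition curve.
for conc in (3.0, 30.0, 300.0, 3000.0):
    print(conc, inhibition_response(conc, top=100.0, bottom=10.0, ic50=300.0, hill=1.0))
```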

export_data()¶

This method will export data produced at each stage of the analysis to an Excel document. The data sets are placed into separate sheets.
The title of the Excel file can be set as an argument within the export_data function. If no title is specified, the resulting document will be named after the title of the plate (the title of the plate can be set in the first step - see top of the page).

# plate1.export_data("test_export.xlsx")

Multi-plate processing¶

It is also possible to group together several assays for a combined analysis. This uses the 'multiplate' subpackage - if you want to call help on one of these functions, call 'help(mp.CaFlexGroup.insert_function_here)'.

Importing the package¶

Normally this would be done at the start of the notebook.

from calciumflexanalysis import multiplate as mp

Uploading the flex data¶

As before, the user must upload the raw text file and a corresponding plate map for each well plate.

# text files to be read in 
datafile = 'data_example 3 nM to 3 uM.txt' 
datafile2 = 'data_example_multiple_compounds.txt'
# plate map csv file updated by user (either the 'short' or 'long' template)
mapcsv = 'plate-map_example 3nM to 3 uM.csv' 
mapcsv2 = 'plate-map_example_multiple_compounds.csv'
# insert each data file and its plate map into the 'CaFlexPlate' class
plate1 = cal.CaFlexPlate(raw_data = datafile, plate_map_file = mapcsv, inject = 60, data_type = 'new')
plate2 = cal.CaFlexPlate(raw_data = datafile2, plate_map_file = mapcsv2, inject = 60, data_type = 'old')

Uploaded!
Uploaded!

The resulting CaFlexPlate objects can then be added to a 'CaFlexGroup', which allows the user to do grouped operations on all plates simultaneously.

plates = mp.CaFlexGroup([plate1, plate2])

Most of the operations performed for a single plate are essentially the same as those for the resulting CaFlexGroup object, albeit with some minor alterations!

Visual Inspection¶

titles and accessing individual plates¶

The user may find it useful to know the title of each plate. Plates are automatically named after their raw data files, but can be retitled using simple dictionary indexing. The same kind of indexing can be used elsewhere in the package to access individual plates.

plates.titles

{'plate_1': 'data_example 3 nM to 3 uM',
 'plate_2': 'data_example_multiple_compounds'}

plates.titles['plate_1'] # indexing 'plate_1' accesses its title.

'data_example 3 nM to 3 uM'

The user can then simply set the indexed plate title to another string:

plates.titles['plate_1'] = 'my favourite plate' # reset the title
plates.titles # show all titles

{'plate_1': 'my favourite plate', 'plate_2': 'data_example_multiple_compounds'}

visualise_plates() and see_plates()¶

These are the sister functions of visualise_assay() and see_plate(). The same arguments such as 'share_y', 'labelby' and 'colorby' apply here.

plates.see_plates(title = 'I owe Lawrence a pint', colorby = 'Concentration')

plates.visualise_plates(share_y = False, title = "So much time saved!", dpi = 120)

Data analysis of multiple plates¶

As mentioned above, this works in much the same way as the analysis of an individual plate, but there are a couple of minor differences to note. Some functions, like 'see_plates' and 'visualise_plates', essentially do their job by performing their function on each plate instance in the group. Other functions, as described below, may collate data from each plate and perform a grouped operation.
This will become handy, for example, if the user would like to analyse data for an individual compound that is spread over multiple plates.
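The two styles of group behaviour described above can be sketched with toy classes: one method simply loops over the plates, while another pools data from all of them. `ToyPlate` and `ToyGroup` are hypothetical stand-ins written for this illustration and do not reflect how 'CaFlexGroup' is implemented.

```python
class ToyPlate:
    """Hypothetical stand-in for a single plate."""

    def __init__(self, title, amplitudes):
        self.title = title
        self.amplitudes = amplitudes  # {condition: [amplitude, ...]}
        self.corrected = False

    def baseline_correct(self):
        self.corrected = True


class ToyGroup:
    """Hypothetical stand-in for a group of plates."""

    def __init__(self, plates):
        # Keys follow the 'plate_1', 'plate_2' naming seen in plates.titles.
        self.plates = {f"plate_{i}": p for i, p in enumerate(plates, start=1)}

    def baseline_correct(self):
        # Per-plate style: run the same operation on every plate in turn.
        for plate in self.plates.values():
            plate.baseline_correct()

    def collate(self):
        # Grouped style: pool amplitudes for the same condition across plates.
        combined = {}
        for plate in self.plates.values():
            for condition, amps in plate.amplitudes.items():
                combined.setdefault(condition, []).extend(amps)
        return combined


group = ToyGroup([
    ToyPlate("a", {"10 nM": [1.0, 1.2]}),
    ToyPlate("b", {"10 nM": [0.9], "30 nM": [0.5]}),
])
group.baseline_correct()
pooled = group.collate()
print(pooled)
```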

baseline_correct()¶

This will baseline correct each plate separately. If you want to have a look at the numbers, the baseline corrected data can be accessed at self.data['baseline_corrected'][plate(insert number)]. For defining the window, grouping data, plotting conditions, and calculating the amplitudes, you will have a choice to use 'baseline_corrected' or 'ratio'.

plates.baseline_correct()

Baseline corrected! See self.processed_data['baseline_corrected']
Plate 1
Baseline corrected! See self.processed_data['baseline_corrected']
Plate 2

get_window()¶

Finds the lowest overall mean gradient across a ten time point window post injection for the plates.

plates.get_window('baseline_corrected')

(46, 56)

def_window()¶

Manually sets each plateau window.

plates.def_window(210, 'baseline_corrected')

all windows equal, self.window updated

(42, 52)
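The step from a start time to an index window can be sketched as a nearest-time lookup. The 5-second sampling interval below is an assumption chosen because it reproduces the (42, 52) output shown above for a start time of 210; the helper name `window_from_time` is hypothetical.

```python
def window_from_time(times, start_time, width=10):
    """Return (start, stop) indices for a window starting near `start_time`."""
    # Index of the recorded time point closest to the requested start time.
    start = min(range(len(times)), key=lambda i: abs(times[i] - start_time))
    return start, start + width


# Assumed time axis: one reading every 5 seconds.
times = [5 * i for i in range(100)]
manual_window = window_from_time(times, 210)
print(manual_window)
```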

plot_conditions()¶

This plots each mean condition for each plate as well as each protein and compound, versus time.

With this and the following plotting functions, the user can choose which protein/compound combinations to plot. All this requires is adding the argument proteins = [insert_name, insert_name] or compounds = [insert_name]. The square brackets must be used and multiple names can be inserted.

plates.plot_conditions('ratio', show_window = True, title = "TRPC5-Collins", error = False)

amplitude()¶

Calculates the response amplitude versus time. This function will simultaneously collate the data to self.data['plateau']['data'] as well as updating each plate object. This makes it handy for the user, giving the option to either keep each plate separate or combine the data.

plates.amplitude('baseline_corrected')

Amplitudes calculated from baseline_corrected. See self.processed_data['plateau']['data']
self.processed_data['plateau']['data'] updated for plate 1.
Amplitudes calculated from baseline_corrected. See self.processed_data['plateau']['data']
self.processed_data['plateau']['data'] updated for plate 2.
self.data updated. See self.data[baseline_corrected]['grouped']

mean_amplitude()¶

This is the first function where the user can decide whether to combine their data or not. I would always recommend combining the data (which is the function's default option) as the user is still able to separate the data for each plate at a later stage. The function returns a table showing the amplitudes and errors at each concentration.
Set use_normalised = True to calculate the normalised amps.

plates.mean_amplitude()

normalise()¶

This will normalise the data calculated by the function amplitude().
normalise() also defaults to combine = True.

Normalises to the mean control.

note: make sure to set use_normalised = True at every subsequent step if you want to use the normalised data.
Set combine = False when calculating mean amplitudes from the normalised data if you want to normalise each plate to its own control. combine = True will normalise to the mean control across all the plates. Subsequently executing plot_curve(combine = True) will plot the combined data, with each plate normalised to its own control.

plates.normalise(combine = True)

Collated data normalised to mean control. See self.data['plateau']['data_normed']

plates.mean_amplitude(use_normalised = True, combine = True)
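The difference between normalising each plate to its own control and normalising to the mean control across all plates can be seen with a small made-up example. The numbers and helper below are hypothetical and only illustrate the arithmetic, not the package's internals.

```python
from statistics import mean


def normalise(amps, control_mean):
    """Express amplitudes as a percentage of a given control mean."""
    return [100 * amp / control_mean for amp in amps]


# Hypothetical plates: plate_2 has a stronger overall signal than plate_1.
plates = {
    "plate_1": {"control": [2.0, 2.0], "compound": [1.0]},
    "plate_2": {"control": [3.0, 3.0], "compound": [1.5]},
}

# Each plate normalised to its own control.
per_plate = {
    name: normalise(data["compound"], mean(data["control"]))
    for name, data in plates.items()
}

# Every plate normalised to the mean control over all plates.
overall_control = mean(c for data in plates.values() for c in data["control"])
combined = {
    name: normalise(data["compound"], overall_control)
    for name, data in plates.items()
}

print(per_plate)
print(combined)
```

In this example both plates show the same response relative to their own control, but the values diverge once a shared control mean is used, which is why the choice matters when signal strength varies between plates.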

plot_curve()¶

This function gives the user a choice whether to combine the data from all the plates, and also combine each protein/compound onto a plot. For example, if the user has a compound spread over the multiple plates, they can either combine their data into a single IC50 or plot two, by setting combine_plates = True or False, respectively.
If there is more than one compound/protein across the plates, these can be plotted on the same graph or on separate graphs by setting combine = True or False, respectively. Have a play around with the combine_plates and combine settings!

Currently if you want to plot data that is normalised to each plate, the only way to plot it is to set combine_plates = False. combine_plates = True will only plot data that is normalised over all the plates.

plates.plot_curve('ic50', combine_plates = True, combine = True, use_normalised = True, activator = 'EA 30 nM', title = 'TRPC5-SYFP2')

C:\Users\lawre\anaconda3\lib\site-packages\calciumflexanalysis\calcium_flex.py:47: RuntimeWarning: invalid value encountered in power
  z=(ic50/x)**hill

export_data()¶

As for a single plate, this method will export data produced at each stage of the analysis to an Excel document. The data sets are placed into separate sheets.
The title of the Excel file can be set as an argument within the export_data function. If no title is specified, the resulting document will be named after the title of each plate (the title of the plate can be set in the first step - see top of the page).
There is also a second argument 'combine_plates'. Setting this to False will produce separate Excel files for each plate. If combine_plates is set to True, a single Excel document will be created containing the data for every plate. Separate sheets will be produced for each protein-compound combination across the plates.

# plates.export_data(combine_plates = True)

Activator Assays¶

EC50 curves can be plotted in the same straightforward way.

data = 'activator_example_data.txt'
plate_map = 'activator_example_map.csv'
plate3 = cal.CaFlexPlate(raw_data = data, plate_map_file = plate_map, inject = 60, data_type = 'old')

Uploaded!

There is now the option to plot multiple controls (set in the 'Type' column of the plate map).
default = ['control']
note: make sure to insert any new control arguments in a list using square brackets.

plate3.plot_conditions('ratio', activator = "activator", title = "TRPC-SYFP2", control = ['control', 'control2'])

plate3.baseline_correct()

Baseline corrected! See self.processed_data['baseline_corrected']

plate3.get_window('baseline_corrected')

plate3.amplitude('baseline_corrected')

Amplitudes calculated from baseline_corrected. See self.processed_data['plateau']['data']

plate3.mean_amplitude()

plate3.plot_curve('ec50', title = 'TRPC-SYFP2')

C:\Users\lawre\anaconda3\lib\site-packages\calciumflexanalysis\calcium_flex.py:43: RuntimeWarning: invalid value encountered in power
  z=(ec50/x)**hill

	amps_normed
A1	107.649390
A2	104.355784
A3	92.559173
A4	95.762802
A5	99.672850
...	...
H8	NaN
H9	NaN
H10	NaN
H11	NaN
H12	NaN

	Protein	Type	Compound	Concentration	Concentration Units	amps_normed	amps_normed_error
0	-1	empty	-1	-1.0	-1	NaN	NaN
1	TRPC5	compound	Inhibitor	10.0	nM	85.855872	6.377952
2	TRPC5	compound	Inhibitor	30.0	nM	75.903861	2.232150
3	TRPC5	compound	Inhibitor	100.0	nM	60.313038	3.126894
4	TRPC5	compound	Inhibitor	300.0	nM	37.801207	3.054347
5	TRPC5	compound	Inhibitor	1000.0	nM	21.043494	2.455948
6	TRPC5	compound	Inhibitor	3000.0	nM	15.623952	2.931612
7	TRPC5	control	-1	-1.0	nM	100.000000	2.746933

	Well ID	Type	Contents	Compound	Protein	Concentration	Concentration Units	Row	Column	Valid	Amplitude
0	A1	control	Activator+DMSO	none	TRPC5	none	nM	A	1	True	2.521769
1	A2	control	Activator+DMSO	none	TRPC5	none	nM	A	2	True	2.445457
2	A3	control	Activator+DMSO	none	TRPC5	none	nM	A	3	True	2.094952
3	A4	control	Activator+DMSO	none	TRPC5	none	nM	A	4	True	2.186409
4	A5	control	Activator+DMSO	none	TRPC5	none	nM	A	5	True	2.236160
...	...	...	...	...	...	...	...	...	...	...	...
187	H8	compound	EA(30 nM) +IBP23 (1 nM)	compound_2	TRPC5	1	nM	H	8	True	1.028408
188	H9	compound	EA(30 nM) +IBP23 (1 nM)	compound_2	TRPC5	1	nM	H	9	True	0.580241
189	H10	compound	EA(30 nM) +IBP23 (1 nM)	compound_2	TRPC5	1	nM	H	10	True	0.647634
190	H11	compound	EA(30 nM) +IBP23 (1 nM)	compound_2	TRPC5	1	nM	H	11	True	1.120601
191	H12	compound	EA(30 nM) +IBP23 (1 nM)	compound_2	TRPC5	1	nM	H	12	True	0.970461

	Protein	Type	Compound	Concentration	Concentration Units	Amplitude	Amplitude Error
0	TRPC5	compound	Inhibitor	3	nM	2.109437	0.063790
1	TRPC5	compound	Inhibitor	10	nM	1.818791	0.139581
2	TRPC5	compound	Inhibitor	30	nM	1.711808	0.055430
3	TRPC5	compound	Inhibitor	100	nM	1.333943	0.075432
4	TRPC5	compound	Inhibitor	300	nM	0.840459	0.054235
5	TRPC5	compound	Inhibitor	1000	nM	0.455551	0.047928
6	TRPC5	compound	Inhibitor	3000	nM	0.342732	0.055724
7	TRPC5	compound	compound_1	1	nM	1.216861	0.034730
8	TRPC5	compound	compound_1	3	nM	1.109066	0.031909
9	TRPC5	compound	compound_1	10	nM	0.841987	0.030580
10	TRPC5	compound	compound_1	30	nM	0.420972	0.029127
11	TRPC5	compound	compound_1	100	nM	0.017994	0.004049
12	TRPC5	compound	compound_1	300	nM	0.008708	0.006856
13	TRPC5	compound	compound_2	1	nM	0.801967	0.110768
14	TRPC5	compound	compound_2	3	nM	0.999166	0.058639
15	TRPC5	compound	compound_2	10	nM	0.649391	0.099005
16	TRPC5	compound	compound_2	30	nM	0.033789	0.022595
17	TRPC5	compound	compound_2	100	nM	-0.002239	0.004872
18	TRPC5	compound	compound_2	300	nM	-0.002768	0.004437
19	TRPC5	control	EA	none	nM	1.347638	0.016135
20	TRPC5	control	none	none	nM	2.303757	0.066007
21	none	blank	none	none	none	0.258315	0.262201
22	none	empty	none	none	none	NaN	NaN

	Well ID	Type	Contents	Compound	Protein	Concentration	Concentration Units	Row	Column	Valid	amps_normed
0	A1	control	Activator+DMSO	none	TRPC5	none	nM	A	1	True	138.126335
1	A2	control	Activator+DMSO	none	TRPC5	none	nM	A	2	True	133.946448
2	A3	control	Activator+DMSO	none	TRPC5	none	nM	A	3	True	114.748055
3	A4	control	Activator+DMSO	none	TRPC5	none	nM	A	4	True	119.757442
4	A5	control	Activator+DMSO	none	TRPC5	none	nM	A	5	True	122.482485
...	...	...	...	...	...	...	...	...	...	...	...
187	H8	compound	EA(30 nM) +IBP23 (1 nM)	compound_2	TRPC5	1	nM	H	8	True	56.329606
188	H9	compound	EA(30 nM) +IBP23 (1 nM)	compound_2	TRPC5	1	nM	H	9	True	31.781896
189	H10	compound	EA(30 nM) +IBP23 (1 nM)	compound_2	TRPC5	1	nM	H	10	True	35.473218
190	H11	compound	EA(30 nM) +IBP23 (1 nM)	compound_2	TRPC5	1	nM	H	11	True	61.379364
191	H12	compound	EA(30 nM) +IBP23 (1 nM)	compound_2	TRPC5	1	nM	H	12	True	53.155638

	Protein	Type	Compound	Concentration	Concentration Units	Amplitude	Amplitude Error
0	-1	empty	-1	-1.0	-1	NaN	NaN
1	TRPC5-SYFP2	compound	activator	1.0	nM	0.056646	0.026389
2	TRPC5-SYFP2	compound	activator	3.0	nM	0.232241	0.083927
3	TRPC5-SYFP2	compound	activator	10.0	nM	0.625596	0.150927
4	TRPC5-SYFP2	compound	activator	30.0	nM	1.207932	0.180265
5	TRPC5-SYFP2	compound	activator	100.0	nM	1.633562	0.102630
6	TRPC5-SYFP2	compound	activator	300.0	nM	1.517087	0.079105
7	TRPC5-SYFP2	control	-1	0.0	nM	-0.010598	0.033528
8	TRPC5-SYFP2	control2	EA	10.0	nM	1.633284	0.107221