Fran wrote up an excellent walk-through for connecting R, via its R-friendly flow packages, to FlowJo. In sum, this lets you read gate information from FlowJo and analyze the gated contents in R.
source("http://bioconductor.org/biocLite.R")
biocLite("flowCore")
library("flowCore")
biocLite("flowUtils")
library("flowUtils")
biocLite("flowViz")
library("flowViz")
fcs = read.FCS("file.fcs")
print(summary(fcs))
plot(fcs)
plot(fcs, c("parm1", "parm2"))
flowEnv = new.env()
read.gatingML("path/to/gates.xml", flowEnv)
gate = filter(fcs, flowEnv$"gateID which will actually be a series of numbers")
gatedPop = Subset(fcs, gate)
plot(gatedPop, c("parm1", "parm2"))
fcsfilename = "5206.001"
fcs <- read.FCS(fcsfilename)
print(summary(fcs) )
flowEnv = new.env()
read.gatingML("test2_gates.xml", flowEnv)
print(ls(flowEnv))
result=filter(fcs, flowEnv$"82940748")
print(summary(result))
parm1Name = "FSC-H"
parm2Name = "SSC-H"
#plot(fcs, c(parm1Name, parm2Name))
plot(Subset(fcs, result), c(parm1Name, parm2Name))
Downloaded the flowFlowJo source from here: http://www.bioconductor.org/packages/2.4/bioc/html/flowFlowJo.html
To load the source as a package in R.app:
From the "Packages & Data" menu select: "Package Installer"
From the combo box select "Local Package Directory"
I selected the "At User Level" check box. Once I had done this, I had to do it every time I went through this process, otherwise the last version installed at user level would take precedence
Click the uppermost of the two bottom-right buttons, which in my installation is labeled "Install,A(with umlaut)(funny p thing that marks a new line in Word)"
This will bring up a select package directory file browser dialog, navigate to the flowFlowJo directory and click "Open"
From the "Packages & Data" menu select: "Package Manager"
The topmost row in the R Package Manager dialog is flowFlowJo with an unchecked check box with the label, "not loaded": select this checkbox.
When I made changes to the source and wanted to reload them into my R session I found I had to close R, reopen it and perform these steps again. I could not find a quicker / more convenient way. I created an Automator.app "watch me" task that clicked all the appropriate buttons for me.
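A console-based alternative to the menu steps might look like the sketch below. It is untested against flowFlowJo itself; reinstallLocalPackage and its arguments are hypothetical names, and unloading a package that defines S4 classes does not always succeed, so restarting R may still be needed.

```r
# Hypothetical helper: pkgDir is the path to the unpacked package source.
reinstallLocalPackage <- function(pkgDir, pkgName = basename(pkgDir)) {
  searchName <- paste0("package:", pkgName)
  # Detach the old version if it is attached, so the new code can be loaded.
  if (searchName %in% search()) {
    detach(searchName, unload = TRUE, character.only = TRUE)
  }
  # repos = NULL with type = "source" installs from a local directory.
  install.packages(pkgDir, repos = NULL, type = "source")
  library(pkgName, character.only = TRUE)
}
```

It would be called as reinstallLocalPackage("path/to/flowFlowJo") after each change to the source.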
To test if flowFlowJo was able to read the gates, in the R console I executed:
testList <- readFlowJoList("WORKSPACEPATH")
z <- getFlowJoGates(testList, fileNamePatterns=c("FCSFILENAMEWITHOUTTHE.FCSATTHEEND "))
summary(z)
z$filter
The first problem was that the XML parser didn't like that the wsp XML document used prefixes without defining them. It had nodes and attributes with names like "gating:PolygonGate". The someString:name pattern is special syntax in XML: it means that "name" belongs to the namespace "someString", and "someString" must be bound to a URI somewhere in the document (I believe on the root node), with something like xmlns:gating="http://www.isac-net.org/std/Gating-ML/v1.5/gating". Otherwise, when the parser gets to gating:PolygonGate it says "hey, what is this gating namespace of which you speak?"
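As a minimal sketch of what a correctly declared prefix looks like, the snippet below parses a tiny made-up document (not a real FlowJo workspace; the Workspace root node is an assumption for illustration) and lists the prefix-to-URI bindings found on its root node.

```r
library(XML)

# Hypothetical one-node document; only the xmlns:gating declaration matters.
xmlText <- '<Workspace xmlns:gating="http://www.isac-net.org/std/Gating-ML/v1.5/gating">
  <gating:PolygonGate/>
</Workspace>'
doc <- xmlParse(xmlText, asText = TRUE)

# Named character vector: names are prefixes, values are namespace URIs.
declared <- xmlNamespaceDefinitions(xmlRoot(doc), simplify = TRUE)
print(declared)
```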
The second problem involved XPath (which R uses through its XML package). XPath lets you select nodes in the tree with strings like ".//gating:vertex/gating:coordinate[1]", which here means "every coordinate node that is the first coordinate child of a vertex node somewhere below this node, with both names in the namespace I have called gating" (see http://www.w3schools.com/Xpath/xpath_syntax.asp or http://www.zvon.org/xxl/XPathTutorial/General/examples.html). This was simple enough, but namespaces made it slightly more complicated. The XPath code also needs to know which namespaces are being referred to, and for some reason it doesn't seem to want to read the ones defined in the XML document, so you must tell it which namespace URIs to use and give them your own prefixes. (In the example above the prefix does not have to be "gating" just because the XML document uses that name; the parsing code can call it anything it wants, as long as the URI matches.) So the namespaces had to be defined in the R code, and here the R syntax differs from plain XPath syntax:
ns = c(gating="http://www.isac-net.org/std/Gating-ML/v1.5/gating", datatype="http://www.isac-net.org/std/Gating-ML/v1.5/datatypes")
axesNames = unlist(xpathApply(aNode, ".//gating:dimension/datatype:parameter", xmlGetAttr, "data-type:name", namespaces = ns))
or
polygonList <- getNodeSet(parentSamp, ".//SampleNode/Subpopulations/Population/Gate/gating:PolygonGate", c(gating="http://www.isac-net.org/std/Gating-ML/v1.5/gating"));
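To make the prefix point concrete, here is a small self-contained sketch. The XML fragment is made up and much simpler than a real workspace (in particular, the plain "value" attribute is an assumption for illustration), and the prefix "g" is deliberately different from the one used in the document.

```r
library(XML)

# Hypothetical, minimal Gating-ML-like fragment; not a real FlowJo workspace.
xmlText <- '<Gate xmlns:gating="http://www.isac-net.org/std/Gating-ML/v1.5/gating">
  <gating:PolygonGate>
    <gating:vertex>
      <gating:coordinate value="10"/><gating:coordinate value="20"/>
    </gating:vertex>
    <gating:vertex>
      <gating:coordinate value="30"/><gating:coordinate value="40"/>
    </gating:vertex>
  </gating:PolygonGate>
</Gate>'
doc <- xmlParse(xmlText, asText = TRUE)

# The prefix "g" exists only in this R code; the URI is what must match.
ns <- c(g = "http://www.isac-net.org/std/Gating-ML/v1.5/gating")

# First coordinate child of every vertex node.
firstCoords <- getNodeSet(doc, "//g:vertex/g:coordinate[1]", namespaces = ns)
xValues <- sapply(firstCoords, xmlGetAttr, "value")
print(xValues)
```

The query matches one node per vertex even though the document itself never uses the prefix "g".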
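It is worth noting that a prefix such as data-type cannot be written as a bare name inside c() in R. This is not because "data" is a magic word: the hyphen is parsed as a minus sign (data minus type), so the name must either be quoted or replaced with a hyphen-free alias such as datatype, as in the first example.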
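A short sketch of the naming issue, using only base R. The bare form is checked with parse() so that the syntax error is caught instead of stopping the script; the quoted form builds the same kind of namespace vector that getNodeSet and xpathApply accept.

```r
# The unquoted form does not parse: R reads data-type as "data minus type".
bareAttempt <- try(parse(text = 'c(data-type = "uri")'), silent = TRUE)
bareFails <- inherits(bareAttempt, "try-error")

# Quoting the name makes the hyphenated prefix legal.
ns <- c(gating = "http://www.isac-net.org/std/Gating-ML/v1.5/gating",
        "data-type" = "http://www.isac-net.org/std/Gating-ML/v1.5/datatypes")
print(names(ns))
```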
The flowFlowJo package contains a file named NAMESPACE, which declares which methods etc. are to be brought in from the various other R packages it uses. So when I wished to use the method "getDefaultNamespace" from the XML package, it was necessary to modify this file to include the declaration for that method, i.e.
importFrom(XML,
docName,
getNodeSet,
xmlAncestors,
xmlChildren,
xmlGetAttr,
xmlName,
xmlParent,
xmlRoot,
xmlSize,
xmlTreeParse,
xpathApply)
became
importFrom(XML,
docName,
getDefaultNamespace,
getNodeSet,
xmlAncestors,
xmlChildren,
xmlGetAttr,
xmlName,
xmlParent,
xmlRoot,
xmlSize,
xmlTreeParse,
xpathApply)
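Before adding a name to importFrom, it can help to confirm that the other package really exports it. A minimal check, assuming the XML package is installed (wanted and canImport are names made up for this sketch):

```r
# Names we intend to list in importFrom(XML, ...).
wanted <- c("getDefaultNamespace", "getNodeSet", "xpathApply")

# getNamespaceExports() lists everything the XML package exports.
canImport <- wanted %in% getNamespaceExports("XML")
names(canImport) <- wanted
print(canImport)
```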