<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>English version of bioinformatyk.eu</title>
	<atom:link href="http://en.bioinformatyk.eu/feed" rel="self" type="application/rss+xml" />
	<link>http://en.bioinformatyk.eu</link>
	<description>Bioinformatics, algorithms and computational biology</description>
	<lastBuildDate>Sat, 25 Feb 2012 14:49:15 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Biomolecular Machines Laboratory is looking for MSc and PhD students</title>
		<link>http://en.bioinformatyk.eu/offers/university-positions/biomolecular-machines-laboratory-is-looking-for-msc-and-phd-students.html</link>
		<comments>http://en.bioinformatyk.eu/offers/university-positions/biomolecular-machines-laboratory-is-looking-for-msc-and-phd-students.html#comments</comments>
		<pubDate>Sat, 25 Feb 2012 14:42:34 +0000</pubDate>
		<dc:creator>Justyna Wojtczak</dc:creator>
				<category><![CDATA[University positions]]></category>
		<category><![CDATA[offers for MSc]]></category>
		<category><![CDATA[offers for PhD]]></category>

		<guid isPermaLink="false">http://en.bioinformatyk.eu/?p=2441</guid>
		<description><![CDATA[Biomolecular Machines Laboratory Centre of New Technologies, University of Warsaw is looking for enthusiastic researchers to work on a project:...]]></description>
			<content:encoded><![CDATA[<p><strong>Biomolecular Machines Laboratory</strong><br />
<strong> Centre of New Technologies, University of Warsaw</strong></p>
<p>is looking for enthusiastic researchers to work on a project:</p>
<p><strong>“Antisense peptide nucleic acids as inhibitors of bacterial translation”</strong></p>
<p>The long-term goal of this project is to apply both computational and experimental techniques to design specific nucleic acid mimics that will target the ribosomal RNA and inhibit bacterial translation. The research is funded as a TEAM project within the Innovative Economy Operational Programme (project period: November 2009 to October 2013). For more information see <a href="http://bionano.cent.edu.pl">http://bionano.cent.edu.pl</a></p>
<p>We are looking for:</p>
<p><strong>PhD student:</strong><br />
background in (bio)physics, (bio)chemistry, mathematics or related discipline, candidates must already be in or enter an official PhD program at the time their project is started, net monthly fellowship <strong>3000 PLN</strong> (not subject to income tax) + benefits, awarded from the Team project of the Foundation for Polish Science till October 31st 2013 (possible future funding extension after this date from other funds).</p>
<p><strong>MSc student:</strong><br />
three years of college completed by the time their project is started, background in physics, chemistry, mathematics, computer science or related discipline, net monthly fellowship <strong>1000 PLN</strong> (not subject to income tax), fellowship available up to October 31st 2013.</p>
<p>You will join a group of scientists whose research bridges the fields of physics, mathematics and life sciences. Collaborative opportunities exist with research groups at University of California, San Diego, University of Virginia, Charlottesville, NEST, Scuola Normale Superiore, Pisa and Masaryk University in Brno.</p>
<p>If you are interested in joining the group, please send your CV and a cover letter (both in English, preferably as PDF), indicating the position you are applying for (MSc or PhD student), to Joanna Trylska (joanna@cent.uw.edu.pl).</p>
<p>Applications (in English) should contain, if applicable, research experience, participation in conferences, a list of publications, the cumulative average grade and, in the case of PhD students, the name of one scientist willing to write a recommendation letter.<br />
Top candidates will be invited for an interview (or a conference-call) and asked to present documents confirming their degrees. Good command of English is a must.</p>
<p>Applications will be accepted until <strong>February 26th, 2012</strong>.</p>
<p><em>Please include in your application:</em><br />
<em>&#8220;In accordance with the personal data protection act from 29th August 1997, I hereby agree to process and to store my</em><em> personal data by the Institution for recruitment purposes.</em>”</p>
]]></content:encoded>
			<wfw:commentRss>http://en.bioinformatyk.eu/offers/university-positions/biomolecular-machines-laboratory-is-looking-for-msc-and-phd-students.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>EMBL &#8211; Software Engineer (2 positions)</title>
		<link>http://en.bioinformatyk.eu/offers/job-positions/embl-software-engineer-2-positions.html</link>
		<comments>http://en.bioinformatyk.eu/offers/job-positions/embl-software-engineer-2-positions.html#comments</comments>
		<pubDate>Thu, 19 Jan 2012 19:41:15 +0000</pubDate>
		<dc:creator>Justyna Wojtczak</dc:creator>
				<category><![CDATA[Job positions]]></category>
		<category><![CDATA[job]]></category>
		<category><![CDATA[offers for BSc]]></category>
		<category><![CDATA[offers for MSc]]></category>
		<category><![CDATA[offers for PhD]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://en.bioinformatyk.eu/?p=2436</guid>
		<description><![CDATA[We are looking for a Software Engineer to join the Atlas development team at the European Bioinformatics Institute (EBI) located...]]></description>
			<content:encoded><![CDATA[<p style="text-align: center;"><img class="aligncenter" title="EMBL" src="https://ig14.i-grasp.com/docs/images/660/12/1/general_header_it.jpg" alt="" width="398" height="76" /></p>
<p>We are looking for a Software Engineer to join the Atlas development team at the European Bioinformatics Institute (EBI) located on the Wellcome Trust Genome Campus near Cambridge in the UK.</p>
<p>The ideal candidate will possess the following technical skills:</p>
<ul>
<li>
<div>Excellent Java programming skills: Servlets, Spring, Hibernate</div>
</li>
<li>
<div>Strong experience with server-side programming or strong HTML/JS/JQuery/CSS experience</div>
</li>
<li>
<div>Strong RDBMS development experience</div>
</li>
<li>
<div>Experience with web-based graphics (HTML5 Canvas, SVG) is a plus</div>
</li>
<li>
<div>Familiarity with the Apache Lucene text search engine is a plus</div>
</li>
<li>
<div>Experience in developing and optimizing web applications for performance</div>
</li>
<li>
<div>Experience in distributed computing and multithreaded programming is a plus, as is experience in working on open-source projects.</div>
</li>
</ul>
<p>More information: <a href="http://ig14.i-grasp.com/fe/tpl_embl01.asp?s=CxgIfLQnAyPBgDdPyv&amp;jobid=47359,8758998787&amp;xkey=44708634&amp;c=149861219899&amp;pagestamp=dbbekufjzmyrivbgvb">EMBL Careers</a></p>
]]></content:encoded>
			<wfw:commentRss>http://en.bioinformatyk.eu/offers/job-positions/embl-software-engineer-2-positions.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Writing contest results!</title>
		<link>http://en.bioinformatyk.eu/contest-articles/writing-contest-results.html</link>
		<comments>http://en.bioinformatyk.eu/contest-articles/writing-contest-results.html#comments</comments>
		<pubDate>Sat, 03 Dec 2011 05:37:04 +0000</pubDate>
		<dc:creator>Justyna Wojtczak</dc:creator>
				<category><![CDATA[Contest articles]]></category>

		<guid isPermaLink="false">http://en.bioinformatyk.eu/?p=2406</guid>
		<description><![CDATA[Hello everyone, now it&#8217;s time to share the contest results! (we apologize for the slight delay) After a stormy debate...]]></description>
			<content:encoded><![CDATA[<p><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/12/banner.jpg"><img class="aligncenter size-full wp-image-2426" title="banner" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/12/banner.jpg" alt="" width="490" height="245" /></a></p>
<p><span style="color: #ff6600; font-size: medium;">Hello everyone, now it&#8217;s time to share the <strong>contest results</strong>! </span></p>
<p>(we apologize for the slight delay)</p>
<p>After a stormy debate we decided that&#8230;<span style="color: #ff6600; font-size: medium;"> <strong> </strong></span></p>
<p style="text-align: center;"><span style="color: #ff6600; font-size: medium;"><strong>the official winner is</strong> </span><span style="color: #ff6600;"> </span><span style="font-size: large; color: #993300;"><strong>Matthias Galle</strong></span></p>
<p style="text-align: center;"><a title="Do You Speak DNA?" href="http://en.bioinformatyk.eu/contest-articles/do-you-speak-dna.html"><span style="font-size: medium;">Do You Speak DNA?</span></a></p>
<p style="text-align: center;">Congratulations!</p>
<p style="text-align: center;">&nbsp;</p>
<p style="text-align: center;"><span style="color: #ff6600; font-size: medium;"><strong>The 2nd prize</strong>: </span><span style="color: #993300; font-size: medium;"><a title="Connectomics – The Quest for a Map of the Human Brain" href="http://en.bioinformatyk.eu/contest-articles/connectomics-%e2%80%93-the-quest-for-a-map-of-the-human-brain.html">Connectomics – The Quest for a Map of the Human Brain</a> by <strong>Wojciech Czarnecki</strong></span></p>
<p style="text-align: center;"><span style="color: #ff6600; font-size: medium;"> <strong>The 3rd prize</strong>: </span><span style="color: #993300; font-size: medium;"><a title="PHYLODIGM – PHYLOgenetic tree DIGitalisation Manager" href="http://en.bioinformatyk.eu/contest-articles/phylodigm-phylogenetic-tree-digitalisation-manager.html">PHYLODIGM – PHYLOgenetic tree DIGitalisation Manager</a> by <strong>Witold Januszewski</strong></span></p>
<p style="text-align: center;"><span style="color: #ff6600; font-size: medium;"><br />
</span></p>
<p style="text-align: center;"><span style="color: #ff6600;"><strong>And your choice, <span style="font-size: medium;">the readers&#8217; prize:</span></strong><span style="font-size: medium;"> </span></span><span style="font-size: medium; color: #993300;"><a title="Connectomics – The Quest for a Map of the Human Brain" href="http://en.bioinformatyk.eu/contest-articles/connectomics-%e2%80%93-the-quest-for-a-map-of-the-human-brain.html">Connectomics – The Quest for a Map of the Human Brain</a> by <strong>Wojciech Czarnecki</strong></span></p>
<p style="text-align: center;">It received 12 votes, with the maximum average score of <strong>5.0</strong> points <img src='http://en.bioinformatyk.eu/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p style="text-align: center;">&nbsp;</p>
<p>The winners will receive notification emails.</p>
<p>The jury:<br />
Karen Dowell, <em>The Jackson Laboratory and University of Maine GSBS, Bar Harbor, Maine, USA</em><br />
Nils Gehlenborg, <em>Harvard Medical School, Boston, MA, USA</em><br />
Taylor Raborn, <em>Department of Biology, University of Iowa, United States of America</em><br />
Kristian Rother, <em>Adam Mickiewicz University in Poznan, Poland</em><br />
Teresa Szczepińska, <em>Nencki Institute of Experimental Biology, Warsaw, Poland</em><br />
Justyna Wojtczak, <em>Adam Mickiewicz University in Poznan, Poland</em></p>
<p style="text-align: center;"><span style="color: #ff6600; font-size: medium;"><strong>Thanks a lot to everyone for voting and for participating in this event!</strong></span></p>
]]></content:encoded>
			<wfw:commentRss>http://en.bioinformatyk.eu/contest-articles/writing-contest-results.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Internship for students in ACIB company, Austria</title>
		<link>http://en.bioinformatyk.eu/offers/university-positions/internship-for-students-in-acib-austria.html</link>
		<comments>http://en.bioinformatyk.eu/offers/university-positions/internship-for-students-in-acib-austria.html#comments</comments>
		<pubDate>Sat, 26 Nov 2011 09:43:56 +0000</pubDate>
		<dc:creator>Justyna Wojtczak</dc:creator>
				<category><![CDATA[Job positions]]></category>
		<category><![CDATA[University positions]]></category>
		<category><![CDATA[Erasmus Program]]></category>
		<category><![CDATA[grants]]></category>
		<category><![CDATA[offers for BSc]]></category>
		<category><![CDATA[offers for students]]></category>

		<guid isPermaLink="false">http://en.bioinformatyk.eu/?p=2398</guid>
		<description><![CDATA[Student 3rd, 4th or 5th academic year Special field: Bioinformatics, Microbiology, Biochemistry Company: Austrian Centre for Industrial Biotechnology (ACIB) Web...]]></description>
			<content:encoded><![CDATA[<p><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/acib1.png"><img class="size-full wp-image-2400 alignright" title="acib" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/acib1.png" alt="" width="200" height="141" /></a></p>
<p><strong>Student  3</strong><sup><strong>rd,</strong></sup><strong> 4</strong><sup><strong>th</strong></sup><strong> or 5</strong><sup><strong>th</strong></sup><strong> academic year</strong></p>
<p>Special field: Bioinformatics, Microbiology, Biochemistry</p>
<p>Company: Austrian Centre for Industrial Biotechnology (ACIB)</p>
<p>Web site: <a rel="nofollow" href="http://www.acib.at/"><strong>http://www.acib.at</strong></a></p>
<p>Contact person: Juergen Zanghellini</p>
<p>E-mail: <a href="mailto:juergen.zanghellini@acib.at">juergen.zanghellini@acib.at</a><br />
Tel.: +43 1 47654 8042</p>
<p><strong>The company</strong></p>
<p>ACIB is an Austrian company which works in the field of industrial biotechnology and focuses its research on biocatalysis, enzymes, proteins and pharmaceuticals, using techniques from chemistry, biology and genetic engineering. “Austrian Centre of Industrial Biotechnology” stands for new production processes with higher economic efficiency and for products with improved ecological efficiency, higher purity and better quality.</p>
<p><strong>The mission</strong></p>
<p>We are looking for an engineering or master&#8217;s student with some programming experience. Typically, we use Perl, MATLAB and/or Mathematica. However, coding experience is not a prerequisite, as long as he/she is open to computational work. In our lab she/he will pick up some mathematical and programming skills. We actively collaborate with wet lab scientists, but we do NOT do any wet lab research.</p>
<p lang="en-US">&nbsp;</p>
<p>Two major projects are concerned with:<br />
(*) the computational design of a lab-on-a-chip device. The sensor will be applied to measure cell biomass.<br />
(*) the reconstruction of metabolic networks for various <em>E. coli</em>, <em>Lactobacillus</em> and <em>Pichia</em> strains. These models are used to predict genetic interventions for optimal recombinant protein production in those hosts.</p>
<p lang="en-US">&nbsp;</p>
<p><strong>Your profile</strong></p>
<ul>
<li>Very good written and oral English</li>
<li>Knowledge of Perl, MATLAB and/or Mathematica</li>
<li>Willingness to work and to gain knowledge</li>
<li>Good communication skills</li>
<li>Ability to work independently</li>
</ul>
<p><strong>Offer</strong></p>
<p>We offer a very international and dynamic working environment, which is strongly results-oriented. Our research organization is located in Vienna (Austria).</p>
<p lang="en-US">&nbsp;</p>
<p>The offer is addressed to students in their 3<sup>rd</sup>, 4<sup>th</sup> or 5<sup>th</sup> academic year who <strong>can go</strong> abroad under the LLP Erasmus Program. No additional payment is granted. We can help with finding accommodation in Vienna, but we do not cover accommodation costs. We provide all necessary equipment and internal training.</p>
<p lang="en-US">&nbsp;</p>
<p><strong>Applying</strong></p>
<p>If you are interested, please send your application in English, containing a CV, a cover letter, references from previous jobs or training and a transcript of records, to <a href="mailto:juergen.zanghellini@acib.at">juergen.zanghellini@acib.at</a> by <strong>15.12.2011.</strong></p>
<p lang="en-US">&nbsp;</p>
<p><strong>More information about our work can be found on the following website: </strong></p>
<p lang="en-US">&nbsp;</p>
<p><a rel="nofollow" href="http://lamp3.tugraz.at/%7Eacib/index.php/wbPage/wbShow/metabolicmodelling">http://lamp3.tugraz.at/~acib/index.php/wbPage/wbShow/metabolicmodelling</a></p>
]]></content:encoded>
			<wfw:commentRss>http://en.bioinformatyk.eu/offers/university-positions/internship-for-students-in-acib-austria.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Teaching Computers to See: Biological Applications</title>
		<link>http://en.bioinformatyk.eu/contest-articles/teaching-computers-to-see-biological-applications.html</link>
		<comments>http://en.bioinformatyk.eu/contest-articles/teaching-computers-to-see-biological-applications.html#comments</comments>
		<pubDate>Mon, 07 Nov 2011 08:23:28 +0000</pubDate>
		<dc:creator>Wojciech Czarnecki</dc:creator>
				<category><![CDATA[Contest articles]]></category>

		<guid isPermaLink="false">http://en.bioinformatyk.eu/?p=2365</guid>
		<description><![CDATA[by Wojciech Czarnecki 1. Introduction Human vision is one of the most complex senses, able to effectively solve various, difficult...]]></description>
			<content:encoded><![CDATA[<p><em><span style="font-size: medium">by Wojciech Czarnecki</span></em></p>
<p lang="en-US"><span style="font-size: small"><strong>1. Introduction</strong></span></p>
<p style="text-align: justify"><span style="font-size: small">Human vision is one of the most complex senses, able to solve a variety of difficult problems effectively. Despite many attempts, we still do not know how visual information is processed, analyzed and stored in our brain. This makes it very difficult to design and develop algorithms that can mimic some of its behavior. But what would happen if, instead of giving the computer strict “step by step” instructions on how to deal with some task, we could just give it an example of such a process and let it find an accurate algorithm by itself? Sounds a bit sci-fi? But that is exactly what machine learning is about. In this short article I would like to present some basic ideas and applications of this fascinating field of computer science.</span></p>
<p><span style="font-size: small"><strong>2. Machine Learning in biology</strong></span></p>
<p style="text-align: justify"><span style="font-size: small">Methods of artificial intelligence are increasingly used to accomplish complex biological tasks for which no exact algorithms (or models) are known. One can find neural network-based applications for protein secondary structure prediction [1], HMM-based gene detectors [2] and microarray expression analysis using SVMs [3].</span></p>
<p style="text-align: justify"><span style="font-size: small">Of particular interest is the application of machine learning techniques to various image analysis problems, such as automatic cell detection and measurement, neuron reconstruction and image segmentation.</span></p>
<p lang="en-US"><span style="font-size: small"><strong>3. Supervised learning</strong></span></p>
<p style="text-align: justify"><span style="font-size: small">The most common approach to machine learning in image analysis tasks is so-called </span><span style="font-size: small"><em>supervised learning</em></span><span style="font-size: small"> (see Fig. 1). In order to generate a model (a hypothesis about the world), one has to provide the algorithm with a set of pairs of the form </span><span style="font-size: small"><em>(sample input, expected output)</em></span><span style="font-size: small">, which is called a training set. Inputs are (because of the computer architecture) given as vectors (lists) of numbers (object features). Of course, in real applications, finding the best possible representation of the data is an important (and difficult) problem [4]. The output of the model is also a vector of numbers (possibly of length one) and usually expresses the likelihood that the input belongs to a particular predefined class (type). Once the training set is prepared, the main learning algorithm tries to set all the model’s parameters automatically so that the model fits the training set as well as possible. From this moment on, the system is ready to predict the correct output value for data it has not seen before.</span></p>
<p style="text-align: center"><span style="font-size: small"> </span></p>
<div id="attachment_2366" class="wp-caption aligncenter" style="width: 409px"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/supervised.jpg"><img class="size-full wp-image-2366  " src="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/supervised.jpg" alt="" width="399" height="462" /></a><p class="wp-caption-text">Figure 1. Diagram of general supervised learning.</p></div>
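<p style="text-align: justify"><span style="font-size: small">The supervised learning loop described above can be illustrated with a minimal sketch. It is not the model used in this article: it assumes logistic regression trained by plain gradient descent, and the toy training set, the learning rate and the function names (<em>train</em>, <em>predict</em>) are all hypothetical choices made only for illustration.</span></p>

```python
import math


def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Likelihood that the feature vector belongs to class 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(training_set, learning_rate=1.0, epochs=5000):
    """Fit model parameters to (sample input, expected output) pairs."""
    n_features = len(training_set[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(training_set)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, expected in training_set:
            # Difference between the model output and the expected output.
            error = predict(weights, bias, features) - expected
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        # Move the parameters a small step against the average gradient.
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Hypothetical toy training set: one feature, class 1 for the larger values.
training_set = [([0.0], 0), ([0.2], 0), ([0.4], 0),
                ([0.6], 1), ([0.8], 1), ([1.0], 1)]
weights, bias = train(training_set)

# Predictions for inputs that were not part of the training set.
low = predict(weights, bias, [0.1])
high = predict(weights, bias, [0.9])
```

<p style="text-align: justify"><span style="font-size: small">After training, <em>low</em> falls below 0.5 and <em>high</em> above it, which is the sense in which the fitted model generalizes to data it has not seen before.</span></p>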
<p><strong>4. Sample biological data application</strong></p>
<p style="text-align: justify"><span style="font-size: small">Let us consider the problem of detecting the boundaries of cells in a grayscale microscopy image. Our input data is an array of values in the range 0-255, and our goal is to produce an array of binary values (0 or 1), where 1 means that the corresponding pixel of the original image is part of a boundary and 0 means that it is not.</span></p>
<p style="text-align: justify"><span style="font-size: small">As stated before, to use machine learning for some problem, one needs a good representation of the feature (input) vector. In computer vision problems, the most common methods are:</span></p>
<ul>
<li><span style="font-size: small">Creating a vector from the pixel intensities of some part of the image (e.g. NxN pixels).</span></li>
<li><span style="font-size: small">Statistical information about some image region (like the median, the amplitude of values, some energy function etc.).</span></li>
<li><span style="font-size: small">Concatenation of the two previous vectors.</span></li>
</ul>
<p style="text-align: justify"><span style="font-size: small">See Fig. 2 for an example in which the input vector is extracted from a 3x3 px window of a grayscale image showing a small cell. This procedure has to be repeated for each (or at least for many) such subimages.</span></p>
<div id="attachment_2367" class="wp-caption aligncenter" style="width: 433px"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/features.jpg"><img class="size-full wp-image-2367   " src="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/features.jpg" alt="" width="423" height="243" /></a><p class="wp-caption-text">Figure 2. Sample part of the image, and corresponding feature vector extracted for 3x3 sample size. According to the nature of the used model some further data processing may be required (e.g. normalization of vector dimensions for a neural network).</p></div>
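<p style="text-align: justify"><span style="font-size: small">A minimal sketch of this feature extraction step is shown below. It concatenates the 9 pixel intensities of a 3x3 window with 3 statistical values. The choice of statistics (median, amplitude and mean) is an assumption made for illustration, as are the function names and the tiny sample image; border pixels are simply skipped.</span></p>

```python
from statistics import median


def window_features(image, row, col):
    """Feature vector for the 3x3 window centered at (row, col).

    Returns 9 pixel intensities followed by 3 statistics
    (median, amplitude, mean), 12 values in total.
    """
    pixels = [image[r][c]
              for r in range(row - 1, row + 2)
              for c in range(col - 1, col + 2)]
    amplitude = max(pixels) - min(pixels)
    mean = sum(pixels) / len(pixels)
    return pixels + [median(pixels), amplitude, mean]


def all_features(image):
    """Feature vectors for every pixel that has a full 3x3 neighborhood."""
    height, width = len(image), len(image[0])
    return {(r, c): window_features(image, r, c)
            for r in range(1, height - 1)
            for c in range(1, width - 1)}


# Hypothetical 4x4 grayscale image with a bright vertical line.
image = [
    [10, 10, 200, 10],
    [10, 10, 200, 10],
    [10, 10, 200, 10],
    [10, 10, 200, 10],
]
features = all_features(image)
```

<p style="text-align: justify"><span style="font-size: small">Here <em>features</em> maps each interior pixel position to its 12-element vector, so this 4x4 image yields four vectors.</span></p>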
<p style="text-align: justify"><span style="font-size: small">Once </span><span style="font-size: small">the feature vector representation is set, one has to prepare a training set. To do that, sample output is also required (see Fig. 3 for an example).</span></p>
<div id="attachment_2368" class="wp-caption aligncenter" style="width: 314px"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/output.jpg"><img class="size-full wp-image-2368" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/output.jpg" alt="" width="304" height="305" /></a><p class="wp-caption-text">Figure 3. Sample boundaries.</p></div>
<p style="text-align: justify"><span style="font-size: small">For each 12-element feature vector (9 pixel intensities and 3 statistical values), there is one desired output value: 1 if the center pixel of the considered 3&#215;3 window lies on the boundary and 0 otherwise.</span></p>
<p style="text-align: justify"><span style="font-size: small">There are many machine learning models that can be used for such a task (for a detailed description of each of them and the reasons to choose one over another, see </span><span style="font-size: small"><em>Neural Networks and Learning Machines</em></span><span style="font-size: small"> [5]), e.g.</span></p>
<ul>
<li><span style="font-size: small">Linear/Logistic regression</span></li>
<li><span style="font-size: small">Neural Networks (multi-layer, convolutional, recursive)</span></li>
<li><span style="font-size: small">Support Vector Machines</span></li>
</ul>
<p style="text-align: justify"><span style="font-size: small">Once training is completed, the model can estimate the likelihood that a given input vector encodes a window centered on a boundary pixel</span><span style="font-size: small"> (see Fig. 4).</span></p>
<p style="text-align: center"><span style="font-size: small"> </span></p>
<div id="attachment_2369" class="wp-caption aligncenter" style="width: 422px"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/trained.jpg"><img class="size-full wp-image-2369   " src="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/trained.jpg" alt="" width="412" height="184" /></a><p class="wp-caption-text">Figure 4. Example output for three layer neural network trained on 12-dimensional input vectors extracted from 512x512px image.</p></div>
<p style="text-align: justify"><span style="font-size: small">Such a likelihood can easily be converted to a binary value by thresholding. More precisely, a pixel x is considered part of a boundary if and only if the output of the trained machine learning model, for the vector generated from x, is greater than or equal to 0.5 (see Fig. 5).</span></p>
<div id="attachment_2370" class="wp-caption aligncenter" style="width: 410px"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/results.jpg"><img class="size-full wp-image-2370" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/results.jpg" alt="" width="400" height="400" /></a><p class="wp-caption-text">Figure 5. Sample boundaries detection by neural network with 0.5 thresholding</p></div>
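<p style="text-align: justify"><span style="font-size: small">The thresholding rule is short enough to write out directly. The sketch below assumes the model outputs are already arranged as a 2D list of likelihoods, one per pixel; the function name and the sample values are hypothetical.</span></p>

```python
def threshold(likelihoods, cutoff=0.5):
    """Mark a pixel as boundary (1) when its likelihood is >= cutoff."""
    return [[1 if p >= cutoff else 0 for p in row] for row in likelihoods]


# Hypothetical model outputs for a 2x3 block of pixels.
likelihoods = [
    [0.10, 0.50, 0.90],
    [0.49, 0.51, 0.00],
]
mask = threshold(likelihoods)
```

<p style="text-align: justify"><span style="font-size: small">Note that a likelihood of exactly 0.5 counts as a boundary, matching the "greater than or equal to" condition in the text.</span></p>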
<p style="text-align: justify"><span style="font-size: small">This is only a simple example of what can be done with this kind of approach. Scientists use machine learning based methods for various computer vision problems. </span><span style="color: #000000"><span style="font-size: small">Saadia Iftikhar et al. [6] used an SVM to detect endothelial cell boundaries in rabbit aortic images, Srinivas C. Turaga et al. [7] used convolutional neural networks to generate affinity graphs for segmenting images of neurons, and Qing Zheng et al. [8] used a neural network for automated cell recognition. The possible applications are almost endless; the only real boundary is human imagination. </span></span></p>
<p><span style="font-size: small"><strong>5. References</strong></span></p>
<p><span style="font-size: small">[1] D. T. Jones, “Protein secondary structure prediction based on position-specific scoring matrices”, </span><span style="font-size: small"><em>Journal of Molecular Biology</em></span><span style="font-size: small">, vol. 292, no. 2, pp. 195–202, 1999.</span></p>
<p><span style="font-size: small">[2] C. B. Burge, </span><span style="font-size: small"><em>Modeling dependencies in pre-mRNA splicing signals</em></span><span style="font-size: small">, pp. 127–163. 1998.</span></p>
<p><span style="font-size: small">[3] M. P. S. Brown, W. N. Grundy, D. Lin, N. Cristianini, C. W. Sugnet, T. S. Furey, M. Ares, and D. Haussler, “Knowledge-based analysis of microarray gene expression data by using support vector machines”, </span><span style="font-size: small"><em>Proceedings of the National Academy of Sciences of the United States of America</em></span><span style="font-size: small">, vol. 97, no. 1, pp. 262–267, 2000</span></p>
<p><span style="font-size: small">[4] </span><span style="color: #333333"><span style="font-size: small">Nguyen, M. H., &amp; De La Torre, F. (2010). Optimal feature selection for support vector machines.</span></span><span style="color: #333333"><span style="font-size: small"> </span></span><span style="color: #333333"><span style="font-size: small"><em>Pattern Recognition</em></span></span><span style="color: #333333"><span style="font-size: small">,</span></span><span style="color: #333333"><span style="font-size: small"> </span></span><span style="color: #333333"><span style="font-size: small"><em>43</em></span></span><span style="color: #333333"><span style="font-size: small">(3), 584-591. Elsevier.</span></span></p>
<p><span style="font-size: small">[5] </span><span style="color: #333333"><span style="font-size: small">Haykin, S. (2008). <em>Neural Networks and Learning Machines</em>. Pearson Prentice Hall, New Jersey, USA.</span></span></p>
<p><span style="font-size: small">[6] </span><span style="color: #333333"><span style="font-size: small">Iftikhar, S., Bond, A. R., Wagan, A. I., Weinberg, P. D., &amp; Bharath, A. A. (2011). Segmentation of Endothelial Cell Boundaries of Rabbit Aortic Images Using a Machine Learning Approach.</span></span><span style="color: #333333"><span style="font-size: small"> </span></span><span style="color: #333333"><span style="font-size: small"><em>International Journal of Biomedical Imaging</em></span></span><span style="color: #333333"><span style="font-size: small">,</span></span><span style="color: #333333"><span style="font-size: small"><em>2011</em></span></span><span style="color: #333333"><span style="font-size: small">, 270247. Hindawi Publishing Corporation.</span></span></p>
<p><span style="font-size: small">[7] </span><span style="color: #333333"><span style="font-size: small">Turaga, S. C., Murray, J. F., Jain, V., Roth, F., Helmstaedter, M., Briggman, K., Denk, W., et al. (2010). Convolutional networks can learn to generate affinity graphs for image segmentation.</span></span><span style="color: #333333"><span style="font-size: small"> </span></span><span style="color: #333333"><span style="font-size: small"><em>Neural Computation</em></span></span><span style="color: #333333"><span style="font-size: small">,</span></span><span style="color: #333333"><span style="font-size: small"> </span></span><span style="color: #333333"><span style="font-size: small"><em>22</em></span></span><span style="color: #333333"><span style="font-size: small">(2), 511-38. MIT Press</span></span></p>
<p><span style="font-size: small">[8] </span><span style="color: #333333"><span style="font-size: small">Zheng, Q., Milthorpe, B. K., &amp; Jones, A. S. (2004). Direct neural network application for automated cell recognition.</span></span><span style="color: #333333"><span style="font-size: small"> </span></span><span style="color: #333333"><span style="font-size: small"><em>Cytometry Part A the journal of the International Society for Analytical Cytology</em></span></span><span style="color: #333333"><span style="font-size: small">,</span></span><span style="color: #333333"><span style="font-size: small"> </span></span><span style="color: #333333"><span style="font-size: small"><em>57</em></span></span><span style="color: #333333"><span style="font-size: small">(1), 1-9.</span></span></p>
]]></content:encoded>
			<wfw:commentRss>http://en.bioinformatyk.eu/contest-articles/teaching-computers-to-see-biological-applications.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Twelve thousands pink dots – bioinformatics story</title>
		<link>http://en.bioinformatyk.eu/contest-articles/twelve-thousands-pink-dots-%e2%80%93-bioinformatics-story.html</link>
		<comments>http://en.bioinformatyk.eu/contest-articles/twelve-thousands-pink-dots-%e2%80%93-bioinformatics-story.html#comments</comments>
		<pubDate>Sun, 06 Nov 2011 22:35:47 +0000</pubDate>
		<dc:creator>Dominika Bijoś</dc:creator>
				<category><![CDATA[Contest articles]]></category>

		<guid isPermaLink="false">http://en.bioinformatyk.eu/?p=2359</guid>
		<description><![CDATA[by Dominika Anna Bijoś One grey rainy Scottish afternoon I finally collected all my data: twelve thousand pink dots. Few...]]></description>
			<content:encoded><![CDATA[<p><em>by Dominika Anna Bijoś</em></p>
<p>One grey rainy Scottish afternoon I finally collected all my data: twelve thousand pink dots. A few of them were snow white, a few of them dark red, but the vast majority were pink: peach pink and salmon pink and baby pink and 256 other shades of pink that I wouldn’t know &#8217;cause pink is just pink. Except it wasn’t. The degree of pinkness was my key biological information, information encoded in the shade of pink.</p>
<p>My bioinformatic story starts with the pink issue, goes through looking for solutions, learning and discoveries of the new computer science world and ends somewhere at the periphery of the cell nucleus. It’s my own personal journey into 21<sup>st</sup> century technology and the programming way of thinking; problem solving and teamwork. Because bioinformatics is like tango: it takes two, the bio(logist) and the computer expert.</p>
<p>Firstly, the pinkness. Fission yeast, cousins of those little fellows that make bread and beer, can grow in the form of circular colonies. Each cell can divide to produce a whole colony, all of its cells originating from this one single cell, i.e. having the same genetic make-up. Manipulated metabolic pathways in yeast are used to reveal gene expression. In my case, if genes were expressed, more modification was happening in the heterochromatic (supposedly inactive) part of the chromosome, and this resulted in a colony with the lightest shade of pink.</p>
<p>I needed tools. I was stubborn: no, I won’t judge twelve thousand pink dots by eye. So I ventured to look for solutions. Using image processing programs to make a grid covering my dots and then figuring out a Visual Basic script to process the numbers wasn’t bioinformatics quite yet, but it was a good start. There was a line describing what I wanted to be done and it repeated itself for all the dots I had: in 5 minutes, for all shades of pink, objectively, irrespective of sun or rain outside! For a humble biologist it was a WOW moment.</p>
<p>I was hooked. I signed up for a master’s course in bioinformatics. It was exciting to venture into the unknown world of command lines and scripts and APIs. It’s amazing how many tools there are! You need to know about them and know how to combine them to work for your benefit. It’s so beautifully different from how biology is always uncertain and difficult to tackle. The rules of loop execution are defined and controlled, and the outcome is written to a file. I loved it.</p>
<p>Confession point: it almost killed me. I didn’t HAVE that way of thinking, I had to ACQUIRE it. A first degree in biology doesn’t give you quite the same skill set as a degree in computer science. I went through countless “Aah! This is how it works!” moments. I followed the flowchart of trying things out, googling things and eventually, after an hour of trying [1], asking someone for help to maintain sanity. If the question concerned how to write a function, answers were there.</p>
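<p>For the curious, the grid idea can be sketched in a few lines of Python. This is not the original Visual Basic script: the image layout (a list of rows of RGB tuples), the function names and the <code>pinkness</code> score itself are assumptions made up for illustration.</p>

```python
from statistics import mean


def pinkness(rgb):
    """Hypothetical score: 0.0 for white or grey pixels, larger for redder ones."""
    r, g, b = rgb
    return max(0.0, (r - (g + b) / 2) / 255)


def score_grid(image, rows, cols):
    """Split the image into rows x cols equal cells and average pinkness per cell."""
    height, width = len(image), len(image[0])
    cell_h, cell_w = height // rows, width // cols
    scores = []
    for i in range(rows):
        row_scores = []
        for j in range(cols):
            # Collect every pixel that falls inside grid cell (i, j).
            pixels = [image[y][x]
                      for y in range(i * cell_h, (i + 1) * cell_h)
                      for x in range(j * cell_w, (j + 1) * cell_w)]
            row_scores.append(mean(pinkness(p) for p in pixels))
        scores.append(row_scores)
    return scores


# Toy 4 x 4 "plate" with one colony per 2 x 2 cell (made-up colours).
WHITE, PINK, RED = (255, 255, 255), (255, 192, 203), (139, 0, 0)
image = [[WHITE, WHITE, PINK, PINK],
         [WHITE, WHITE, PINK, PINK],
         [RED, RED, PINK, WHITE],
         [RED, RED, WHITE, PINK]]
scores = score_grid(image, 2, 2)
```

<p>The same line of logic runs for every cell of the grid, which is exactly the appeal: each dot is judged by one rule, whatever the weather.</p>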
<p style="text-align: center;"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/DABijos_12000pinkDots_figure_111031.png"><img class="aligncenter size-full wp-image-2360" title="DABijos_12000pinkDots_figure_111031" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/DABijos_12000pinkDots_figure_111031.png" alt="" width="441" height="331" /></a></p>
<p>It was eye opening. The world is perceived differently in the two worlds I knew: biologists count the letters of the DNA alphabet A, C, G, T as 1, 2, 3, 4. Computer science guys don’t, they see 0, 1, 2, 3. So I had to communicate. Understand both worlds. You don’t have to be perfect in both of them but you need to be able to talk. Genes, chromatin, cell, protein, expression levels, single nucleotide polymorphisms (SNPs) and genome-wide association studies (GWAS) – if you run away screaming hearing those, don’t bother with bioinformatics. I didn’t know what a loop was when I started. I got over it. And it was worth it: my first database storing yeast strain information, my first Java program analyzing mitochondrial DNA, my MSc on stem cell gene annotation. I discovered and conquered a new world!</p>
<p>I was so thrilled with my pink discoveries [2] that I ventured deeper into the nucleus of the cell. My study involved looking at the DNA (chromatin) located at the periphery of the nucleus. Microarray and, later on, high-throughput sequencing data from complicated molecular experiments piled up on the server. And then I was taken aback by the scary discovery: “bioinformatics is nothing else than torturing the data until it confesses… and if you torture it enough, you can get it to confess to anything.” (Fred Menger).</p>
<p>Back to the starting point. Initially I was enchanted by how I controlled programming. Now I had to awaken the biologist within me: because does the yeast cell, living happily making you beer, care whether it is statistically significant or not? Well, I bet it doesn’t really. The beauty is to discover how things work and it requires more than just tools to do so. You need to dig deep into it and then take 5 steps back and try to draw the big picture. With biological data, you have to be both the biologist and the computer geek: the strict mind which comprehends the analysis issues, confidence intervals and significant changes, and the biological mind which includes the biological variability, the third null hypothesis this week and keeps in mind the big picture of the cell not caring to be statistically significant.</p>
<p>When the question is how the biology works &#8211; nobody knows the answer. It is up to you and your method of data torture, raised to the highest standards, to make sense of how the world of the cell works. You can make it bigger, but I assure you that it is big enough to be able to get lost in it. I am losing myself and finding the way out. Every day.</p>
<p>Twelve thousand pink dots were the eye opener. I saw bioinformatics as a tool, a challenge, a discovery and the interdisciplinary bridge. I flew into the unknown and learned that it is up to me to discover how it works. And bioinformatics was my favorite step to becoming “curiouser and curiouser” about the world [3].</p>
<p><strong>References:</strong></p>
<ol>
<li>xkcd: A webcomic of romance, sarcasm, math, and language. Tech Support Cheat Sheet: <span style="color: #0000ff;"><span style="text-decoration: underline;"><a href="http://xkcd.com/627/">http://xkcd.com/627/</a></span></span></li>
<li><span style="color: #0000ff;"><a href="http://www.ncbi.nlm.nih.gov/pubmed?term=%22Bayne%20EH%22%5BAuthor%5D"><span style="color: #000000;">Bayne 	EH</span></a></span>, <span style="color: #0000ff;"><a href="http://www.ncbi.nlm.nih.gov/pubmed?term=%22White%20SA%22%5BAuthor%5D"><span style="color: #000000;">White 	SA</span></a></span>, <span style="color: #0000ff;"><a href="http://www.ncbi.nlm.nih.gov/pubmed?term=%22Kagansky%20A%22%5BAuthor%5D"><span style="color: #000000;">Kagansky 	A</span></a></span>, <a href="http://www.ncbi.nlm.nih.gov/pubmed?term=%22Bijos%20DA%22%5BAuthor%5D"><span style="text-decoration: underline;">Bijos</span><span style="color: #0000ff;"><span style="text-decoration: underline;"><span style="color: #000000;"> DA</span></span></span></a>, <span style="color: #0000ff;"><a href="http://www.ncbi.nlm.nih.gov/pubmed?term=%22Sanchez-Pulido%20L%22%5BAuthor%5D"><span style="color: #000000;">Sanchez-Pulido 	L</span></a></span>, <span style="color: #0000ff;"><a href="http://www.ncbi.nlm.nih.gov/pubmed?term=%22Hoe%20KL%22%5BAuthor%5D"><span style="color: #000000;">Hoe 	KL</span></a></span>, <span style="color: #0000ff;"><a href="http://www.ncbi.nlm.nih.gov/pubmed?term=%22Kim%20DU%22%5BAuthor%5D"><span style="color: #000000;">Kim 	DU</span></a></span>, <span style="color: #0000ff;"><a href="http://www.ncbi.nlm.nih.gov/pubmed?term=%22Park%20HO%22%5BAuthor%5D"><span style="color: #000000;">Park 	HO</span></a></span>, <span style="color: #0000ff;"><a href="http://www.ncbi.nlm.nih.gov/pubmed?term=%22Ponting%20CP%22%5BAuthor%5D"><span style="color: #000000;">Ponting 	CP</span></a></span>, <span style="color: #0000ff;"><a href="http://www.ncbi.nlm.nih.gov/pubmed?term=%22Rappsilber%20J%22%5BAuthor%5D"><span style="color: #000000;">Rappsilber 	J</span></a></span>, <span style="color: #0000ff;"><a href="http://www.ncbi.nlm.nih.gov/pubmed?term=%22Allshire%20RC%22%5BAuthor%5D"><span style="color: #000000;">Allshire 	RC</span></a></span>. 
Stc1: a critical link between RNAi and 	chromatin modification required for heterochromatin integrity. <span style="color: #0000ff;"><span style="text-decoration: underline;"><a href="http://www.ncbi.nlm.nih.gov/pubmed/20211136"><span style="color: #000000;">Cell.</span></a></span></span> 2010 Mar 5;140(5):666-77.</li>
<li>&#8220;Curiouser and curiouser! cried Alice.&#8221; &#8211; Lewis Carroll, Alice in Wonderland</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://en.bioinformatyk.eu/contest-articles/twelve-thousands-pink-dots-%e2%80%93-bioinformatics-story.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Connectomics – The Quest for a Map of the Human Brain</title>
		<link>http://en.bioinformatyk.eu/contest-articles/connectomics-%e2%80%93-the-quest-for-a-map-of-the-human-brain.html</link>
		<comments>http://en.bioinformatyk.eu/contest-articles/connectomics-%e2%80%93-the-quest-for-a-map-of-the-human-brain.html#comments</comments>
		<pubDate>Sun, 06 Nov 2011 22:09:21 +0000</pubDate>
		<dc:creator>Wojciech Czarnecki</dc:creator>
				<category><![CDATA[Contest articles]]></category>

		<guid isPermaLink="false">http://en.bioinformatyk.eu/?p=2350</guid>
		<description><![CDATA[by Wojciech Czarnecki 1. Connectome In 2003 The Human Genome Project was completed after twelve years of intense work of...]]></description>
			<content:encoded><![CDATA[<p lang="en-US"><em><span style="font-size: medium;">by </span><span style="font-size: medium;">Wojciech Czarnecki</span></em><span style="font-size: small;"><strong> </strong></span></p>
<p><span style="font-size: small;"><strong>1. Connectome</strong></span></p>
<p style="text-align: justify;"><span style="font-size: small;">In 2003 the Human Genome Project was completed after twelve years of intense work by specialists from all over the world, which started the era of genomics. While genes define our inborn characteristics, how exactly our knowledge, memories and skills are stored is still a mystery. One possible hypothesis says that this kind of information is encoded in the way neurons are connected in our brain [1]. If such information could be easily obtained for a particular human being, it would be possible, for example, to easily diagnose mental disorders and, what is equally (or even more) significant – investigate how our brain works. </span></p>
<p style="text-align: justify;"><span style="font-size: small;">As defined by Hagmann [2], the connectome is the set of all connections in a brain, considered as a single entity. One can therefore view it as a graph in which the vertices are individual neurons and the edges are the connections between them. So far only one structure of this kind is known – the connectome (with 302 vertices and about 7000 edges) of the nematode C. elegans [3] (see fig. 1).</span></p>
<div id="attachment_2351" class="wp-caption aligncenter" style="width: 410px"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/elegans.png"><img class="size-full wp-image-2351" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/elegans.png" alt="" width="400" height="374" /></a><p class="wp-caption-text">Figure 1. C. elegans connectome visualized with Mathematica.</p></div>
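<p style="text-align: justify;"><span style="font-size: small;">To make the graph view concrete, here is a minimal Python sketch of a connectome stored as adjacency sets. The neuron names and connections are invented for illustration, and treating connections as directed (from the sending to the receiving neuron) is an assumption, not something the definition above requires.</span></p>

```python
from collections import defaultdict


def build_connectome(connections):
    """Build a directed graph: neuron -> set of neurons it connects to."""
    graph = defaultdict(set)
    for pre, post in connections:
        graph[pre].add(post)
        graph[post]  # touching the key registers the target neuron as a vertex
    return graph


# Hypothetical toy data: four neurons, five connections.
toy_connections = [("n1", "n2"), ("n1", "n3"), ("n2", "n3"),
                   ("n3", "n4"), ("n4", "n1")]
connectome = build_connectome(toy_connections)

num_vertices = len(connectome)
num_edges = sum(len(targets) for targets in connectome.values())
print(num_vertices, "vertices,", num_edges, "edges")
```

<p style="text-align: justify;"><span style="font-size: small;">For a graph the size of the C. elegans connectome such a structure fits easily in memory; the difficulties discussed next only appear at the scale of a human brain.</span></p>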
<p style="text-align: justify;"><span style="font-size: small;">The human brain consists of about one hundred billion neurons, accompanied by almost ten thousand times as many connections between them [4]. That is hundreds of thousands of times more than the length of the human genome (about three billion base pairs). Such an enormous volume of data leads to many problems:</span></p>
<ul>
<li><span style="font-size: small;">Data </span><span style="font-size: small;">acquisition – to achieve the greatest possible accuracy, one needs EM/MR images of the whole brain at nano- or micrometer scale.</span></li>
<li><span style="font-size: small;">Storage – even if the edges are represented as pairs of integers, one would still need about 10 petabytes (10 x 10</span><sup><span style="font-size: small;">15 </span></sup><span style="font-size: small;">bytes) of space to store a single connectome (which is about 0.003% of the world’s total stored analog and digital content [5]).</span></li>
<li><span style="font-size: small;">Need 	for fast reconstruction algorithms (of O(n) complexity).</span></li>
<li><span style="font-size: small;">Need 	for methods of statistical analysis of massive graphs.</span></li>
</ul>
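<p style="text-align: justify;"><span style="font-size: small;">The storage figure can be checked with a back-of-the-envelope calculation. The sketch below assumes 10<sup>11</sup> neurons, 10<sup>4</sup> connections per neuron, and an edge stored as two neuron identifiers packed into the smallest whole number of bits; all three are simplifying assumptions.</span></p>

```python
import math

NEURONS = 10**11                 # about one hundred billion neurons
CONNECTIONS_PER_NEURON = 10**4   # about ten thousand connections each

edges = NEURONS * CONNECTIONS_PER_NEURON

# Bits needed to give every neuron a distinct integer identifier.
bits_per_id = math.ceil(math.log2(NEURONS))

# One edge is a pair of identifiers, tightly packed.
bytes_per_edge = 2 * bits_per_id / 8
total_bytes = edges * bytes_per_edge

print(f"{edges:.1e} edges, {bits_per_id} bits per id, {total_bytes:.2e} bytes")
```

<p style="text-align: justify;"><span style="font-size: small;">Under these assumptions the edge list alone takes roughly 10 x 10<sup>15</sup> bytes, before any image data or annotations are counted.</span></p>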
<p><span style="font-size: small;"><strong>2. Current 	methods </strong></span></p>
<p style="text-align: justify;"><span style="font-size: small;">One possible form of input data for the connectome problem is a stack of images from electron microscopy (see fig. 2), in which particular neurons have to be identified. In computer science such a problem is called image segmentation – for a given image (or set of images) one needs to decide for each image pixel (voxel) to which class (object) it belongs.</span></p>
<div id="attachment_2352" class="wp-caption aligncenter" style="width: 426px"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/segmentation.png"><img class="size-full wp-image-2352   " src="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/segmentation.png" alt="" width="416" height="135" /></a><p class="wp-caption-text">Figure 2. From left: sample EM image, its boundary labeling and resulting segmentation. For more details see Jain V, Turaga SC, Seung HS. Machines that learn to segment images: a crucial technology of connectomics. Current Opinion in Neurobiology, 2010.</p></div>
<p style="text-align: justify;"><span style="font-size: small;">The first approach to the problem (and currently the only fully successful one) was manual annotation of the neurons in the microscopy image data. While it was possible to accomplish this for a few hundred neurons, the much larger brains of more complex organisms (such as mammals) require much faster, fully automated methods. </span></p>
<p style="text-align: justify;"><span style="font-size: small;">Most of the current algorithms work in two phases – first, they detect boundaries (the edges of each object), and then simply search for connected components in the image graph, where an edge between two pixels exists if and only if they are adjacent and there is no boundary between them.</span></p>
<p lang="en-US"><span style="font-size: small;">For such an approach, boundary detection can be achieved by using for example:</span></p>
<ul>
<li><span style="font-size: small;">Simple 	edge detectors (</span><span style="font-size: small;">Sobel 	[6], Gaussian based [7], Canny [8]).</span></li>
<li><span style="font-size: small;">Haar 	wavelet transform based method</span><span style="font-size: small;"> [9].</span></li>
<li><span style="font-size: small;">Statistical 	methods</span><span style="font-size: small;"> [10].</span></li>
<li><span style="font-size: small;">Machine 	learning algorithms [11].</span></li>
</ul>
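<p style="text-align: justify;"><span style="font-size: small;">As an illustration of the simplest option on this list, the following Python sketch applies the Sobel operator to a small grayscale image stored as a list of lists. It is a toy version under stated assumptions: border pixels are left at zero, and the threshold that turns gradient magnitude into a boundary mask is an arbitrary value chosen for the example.</span></p>

```python
import math

# Standard 3 x 3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude for every interior pixel (borders stay 0)."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    value = image[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * value
                    gy += SOBEL_Y[dy][dx] * value
            out[y][x] = math.hypot(gx, gy)
    return out


def boundary_mask(image, threshold):
    """Mark pixels whose gradient magnitude exceeds the threshold as boundary."""
    return [[m > threshold for m in row] for row in sobel_magnitude(image)]


# Toy image: dark left half, bright right half (a vertical step edge).
image = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
magnitude = sobel_magnitude(image)
mask = boundary_mask(image, threshold=200)
```

<p style="text-align: justify;"><span style="font-size: small;">In this toy image only the two pixel columns on either side of the step respond; real EM images are far noisier, which is why the other methods on the list exist.</span></p>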
<p style="text-align: justify;"><span style="font-size: small;">The second phase can easily be done with a BFS (or DFS) traversal of the image graph. After that, layer by layer,</span><span style="font-size: small;"> the segmented images can be linked to find out which neurons are connected.</span></p>
<p style="text-align: justify;"><span style="font-size: small;">In more advanced algorithms boundary detection is replaced with so-called affinity graph generation, where instead of labeling pixels “boundary” or “not boundary” the algorithm estimates, for each pair of neighboring pixels, the likelihood that the two pixels are in the same region/object (see fig. 3). </span></p>
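<p style="text-align: justify;"><span style="font-size: small;">A minimal Python sketch of this second phase is shown below. It assumes that boundaries are given as a per-pixel boolean mask (a simplification of "a boundary between two pixels") and that pixels are 4-connected; the function name and the toy mask are illustrative only.</span></p>

```python
from collections import deque


def label_segments(boundary):
    """Label 4-connected regions of non-boundary pixels using BFS.

    Returns (labels, count); boundary pixels keep the label 0.
    """
    height, width = len(boundary), len(boundary[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for start_y in range(height):
        for start_x in range(width):
            if boundary[start_y][start_x] or labels[start_y][start_x]:
                continue  # boundary pixel, or already part of a segment
            count += 1
            labels[start_y][start_x] = count
            queue = deque([(start_y, start_x)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and not boundary[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Toy boundary mask (1 = boundary) separating three regions.
mask = [[0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0],
        [1, 1, 1, 0, 0],
        [0, 0, 1, 0, 0]]
labels, count = label_segments(mask)
```

<p style="text-align: justify;"><span style="font-size: small;">Every pixel is enqueued at most once, so the running time is linear in the number of pixels.</span></p>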
<div id="attachment_2353" class="wp-caption aligncenter" style="width: 403px"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/affinity.png"><img class="size-full wp-image-2353 " src="http://en.bioinformatyk.eu/wp-content/uploads/2011/11/affinity.png" alt="" width="393" height="353" /></a><p class="wp-caption-text">Figure 3. A. Sample EM image, B. Segmentation made by human expert, C. and D. Segmentations based on affinity graphs. For more details see Turaga, S. C., Murray, J. F., Jain, V., Roth, F., Helmstaedter, M., Briggman, K., Denk, W., et al. (2010). Convolutional networks can learn to generate affinity graphs for image segmentation. Neural Computation, 22(2), 511-38. MIT Press.</p></div>
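<p style="text-align: justify;"><span style="font-size: small;">One simple way to turn an affinity graph into a segmentation is to keep only the edges whose affinity exceeds a threshold and take the connected components of what remains. The Python sketch below does this with a union-find structure; the affinity values, the threshold of 0.5 and the function name are assumptions for illustration, and in practice the affinities would come from a trained model such as the convolutional networks cited in the caption of figure 3.</span></p>

```python
def segment_from_affinities(num_pixels, affinities, threshold=0.5):
    """Merge pixels joined by an affinity above the threshold (union-find)."""
    parent = list(range(num_pixels))

    def find(i):
        # Follow parent links to the representative, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (a, b), affinity in affinities.items():
        if affinity > threshold:
            parent[find(a)] = find(b)
    return [find(i) for i in range(num_pixels)]


# Toy 2 x 3 image, pixels numbered row by row:
#   0 1 2
#   3 4 5
# Each key is a pair of neighboring pixels, each value a made-up affinity.
affinities = {
    (0, 1): 0.9, (1, 2): 0.1,
    (3, 4): 0.8, (4, 5): 0.2,
    (0, 3): 0.95, (1, 4): 0.9, (2, 5): 0.85,
}
segments = segment_from_affinities(6, affinities)
```

<p style="text-align: justify;"><span style="font-size: small;">Here pixels 0, 1, 3 and 4 end up in one segment and pixels 2 and 5 in another, because the two low-affinity edges are the only links between the groups.</span></p>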
<p><span style="font-size: small;"><strong>3. Alternative approaches and current progress</strong></span></p>
<p style="text-align: justify;"><span style="font-size: small;">There are also</span><span style="font-size: small;"> much simpler formulations of the brain mapping problem. Instead of an exact graph of all neuron connections, one can search for more statistical information about how groups of neurons or brain regions are connected to each other. Once this is solved, precise neuron-to-neuron mapping can be done independently on each such structure, which allows the whole process to be distributed.</span></p>
<p style="text-align: justify;"><span style="font-size: small;">As stated in </span><span style="font-size: small;">section 1, only one full connectome is known, but thanks to major advances in imaging techniques (especially diffusion magnetic resonance and functional magnetic resonance), some major fiber bundles have been reconstructed, and some anatomically and structurally distinct areas have been identified.</span></p>
<p><span style="font-size: small;"><strong>4. Future</strong></span></p>
<p style="text-align: justify;"><span style="font-size: small;">There are at least a few projects related to connectome reconstruction – the Human Connectome Project, the Open Connectome Project, the Mouse Connectome Project, and the Seung Lab at MIT &#8212; each with a different approach to the problem, and different possible outcomes. For now, we live in the age of genomics, but in just a few years we might witness a new era – the era of connectomics.</span></p>
<p><span style="font-size: small;">I encourage the reader to follow some of the links placed at the end of this article and watch the inspiring speech by Sebastian Seung, PhD.</span></p>
<p><iframe width="500" height="375" src="http://www.youtube.com/embed/sH9zccNtNlA?fs=1&#038;feature=oembed" frameborder="0" allowfullscreen></iframe></p>
<p><span style="font-size: small;"><strong>5. Useful links</strong></span></p>
<ul>
<li><span style="font-size: small;">C. 	Elegans Connectome</span><span style="font-size: small;"> </span><span style="color: #0000ff;"><span style="text-decoration: underline;"><a href="http://www.wormatlas.org/neuronalwiring.html">http://www.wormatlas.org/neuronalwiring.html</a></span></span></li>
<li><span style="font-size: small;">Human 	Connectome Project </span><span style="color: #0000ff;"><span style="text-decoration: underline;"><a href="http://www.humanconnectomeproject.org/">http://www.humanconnectomeproject.org/</a></span></span></li>
<li><span style="font-size: small;">Open 	Connectome Project </span><a href="http://openconnectomeproject.org/" target="_blank">http://openconnectomeproject.org/</a></li>
<li><span style="font-size: small;">Seung 	Lab</span><span style="font-size: small;"> </span><span style="color: #0000ff;"><span style="text-decoration: underline;"><a href="http://hebb.mit.edu/seunglab/home">http://hebb.mit.edu/seunglab/home</a></span></span></li>
<li><span style="font-size: small;">Mouse 	Connectome Project</span><a href="http://www.mouseconnectome.org/" target="_blank"><span style="font-size: small;"> </span>http://www.mouseconnectome.org/</a></li>
</ul>
<p><span style="font-size: small;"><strong>6. References</strong></span></p>
<p><span style="font-size: small;">[1]</span><span style="color: #333333;"><span style="font-size: small;"> </span></span><span style="color: #333333;"><span style="font-size: small;">Connectomics: Tracing the Wires of the Brain (Dana Foundation) </span></span><span style="color: #0000ff;"><span style="text-decoration: underline;"><a href="http://www.dana.org/news/cerebrum/detail.aspx?id=13758"><span style="font-size: small;">http://www.dana.org/news/cerebrum/detail.aspx?id=13758</span></a></span></span></p>
<p><span style="color: #333333;"><span style="font-size: small;">[2] Hagmann, P. (2005).</span></span><span style="color: #333333;"><span style="font-size: small;"> </span></span><span style="color: #333333;"><span style="font-size: small;"><em>From diffusion MRI to brain connectomics</em></span></span><span style="color: #333333;"><span style="font-size: small;">. Doctoral thesis, Institut de traitement des signaux, École Polytechnique Fédérale de Lausanne.</span></span></p>
<p><span style="color: #333333;"><span style="font-size: small;">[3] White, J. G., Southgate, E., Thomson, J. N., &amp; Brenner, S. (1986). The Structure of the Nervous System of the Nematode Caenorhabditis elegans.</span></span><span style="color: #333333;"><span style="font-size: small;"><em>Philosophical Transactions of the Royal Society B Biological Sciences</em></span></span><span style="color: #333333;"><span style="font-size: small;">,</span></span><span style="color: #333333;"><span style="font-size: small;"><em>314</em></span></span><span style="color: #333333;"><span style="font-size: small;">(1165), 1-340. The Royal Society.</span></span></p>
<p><span style="font-size: small;">[4] </span><span style="color: #333333;"><span style="font-size: small;">Drachman, D. A. (2005). Do we have brain to spare?</span></span><span style="color: #333333;"><span style="font-size: small;"> </span></span><span style="color: #333333;"><span style="font-size: small;"><em>Neurology</em></span></span><span style="color: #333333;"><span style="font-size: small;">. </span></span></p>
<p><span style="font-size: small;">[5] Martin Hilbert and Priscila López. The World&#8217;s Technological Capacity to Store, Communicate, and Compute Information. Science, 10 February 2011.</span></p>
<p><span style="font-size: small;">[6] Kittler, J. (1983). On the accuracy of the Sobel edge detector. Image Vision Computing, 1(1), 37-42. </span></p>
<p><span style="font-size: small;">[7] Marr, D., &amp; Hildreth, E. (1980). Theory of edge detection. Proceedings of the Royal Society of London Series B Containing papers of a Biological character Royal Society Great Britain, 207(1167), 187-217. </span></p>
<p><span style="font-size: small;">[8] Ding, L. (2001). Canny edge detector. Most, 34(3), 721-725. Computer Science and Engineering Department Wright State University. </span></p>
<p><span style="font-size: small;">[9] Heric, D., &amp; Zazula, D. (2007). Combined edge detection using wavelet transform and signal registration. Image and Vision Computing, 25(5), 652-662. ELSEVIER SCIENCE BV. </span></p>
<p><span style="font-size: small;">[10] Konishi, S., Yuille, A. L., Coughlan, J. M., &amp; Zhu, S. C. (2003). Statistical Edge Detection: Learning and Evaluating Edge Cues. Analysis, 25(1), 29-36.</span></p>
<p><span style="font-size: small;">[11] Lu, S., Wang, Z., &amp; Shen, J. (2003). Neuro-fuzzy synergism to the intelligent system for edge detection and enhancement. Pattern Recognition, 36(10), 2395-2409. </span></p>
]]></content:encoded>
			<wfw:commentRss>http://en.bioinformatyk.eu/contest-articles/connectomics-%e2%80%93-the-quest-for-a-map-of-the-human-brain.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bioinformatic tools for investigating Cas2 &#8211; the CRISPR-associated protein</title>
		<link>http://en.bioinformatyk.eu/contest-articles/bioinformatic-tools-for-investigating-cas2-the-crispr-associated-protein.html</link>
		<comments>http://en.bioinformatyk.eu/contest-articles/bioinformatic-tools-for-investigating-cas2-the-crispr-associated-protein.html#comments</comments>
		<pubDate>Sun, 23 Oct 2011 20:46:11 +0000</pubDate>
		<dc:creator>Aleksandra Grabowska</dc:creator>
				<category><![CDATA[Contest articles]]></category>
		<category><![CDATA[bioinformatic tools]]></category>
		<category><![CDATA[Cas2 protein]]></category>

		<guid isPermaLink="false">http://en.bioinformatyk.eu/?p=2314</guid>
		<description><![CDATA[by Alexandra Grabowska CRISPR (clustered regularly interspaced short palindromic repeats) is a prokaryotic system of acquired immunity. Its mechanism is...]]></description>
			<content:encoded><![CDATA[<p><em>by Alexandra Grabowska</em></p>
<p>CRISPR (clustered regularly interspaced short palindromic repeats) is a prokaryotic system of acquired immunity. Its mechanism is similar in action to that of RNAi in eukaryotic organisms, and it has gained quite a lot of attention in recent years. These adaptive immunity systems are present in a large share of the archaeal and bacterial genomes known so far.</p>
<p>CRISPR loci are built of short direct repeats which are separated by spacers of exogenous origin. During invasion by a phage or a mobile genetic element, the invasive nucleic acid can be cleaved by the bacterial protein machinery. Part of the alien sequence is then incorporated into a specific genomic locus and gets transcribed with the rest of the CRISPR units in one long RNA chain. After being cleaved, the transcripts are used as guide RNAs to destroy the invasive virus or plasmid. The specificity of this unique immune system relies on the formation of R-loops, which are formed between the RNA spacer sequence and the complementary target DNA.</p>
<p><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/10/1.-R-loop.jpg"><img class="aligncenter size-full wp-image-2315" title="1. R-loop" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/10/1.-R-loop.jpg" alt="" width="422" height="195" /></a></p>
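<p>The repeat-spacer layout described above can be illustrated with a few lines of Python that cut a CRISPR array at every copy of the repeat and keep what lies in between. This is only a sketch: the sequences are invented, and it assumes that every repeat copy is identical, whereas real repeats may differ slightly and dedicated CRISPR finders tolerate such mismatches.</p>

```python
def extract_spacers(array_seq, repeat):
    """Return the spacers found between consecutive exact copies of the repeat."""
    parts = array_seq.split(repeat)
    # parts[0] is the sequence before the first repeat, parts[-1] the sequence
    # after the last one; everything in between is a spacer.
    return [part for part in parts[1:-1] if part]


# Made-up toy sequences, far shorter than real repeats and spacers.
repeat = "GTCGCACTCTTC"
leader = "ATATAT"
spacer_1 = "ACGTACGGCA"
spacer_2 = "TTGACCAGTA"
trailer = "CCGG"
array_seq = leader + repeat + spacer_1 + repeat + spacer_2 + repeat + trailer

spacers = extract_spacers(array_seq, repeat)
print(spacers)
```

<p>Each extracted spacer corresponds to a fragment of foreign DNA that the cell has encountered before, which is what makes the array a record of past infections.</p>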
<p>Cas (CRISPR-associated) proteins are encoded by a cluster of genes adjacent to the arrays of repeat-spacer units. Because direct repeats vary to a great degree between bacteria and spacers are unique alien sequences, <em>cas </em>genes can be used to determine the relationships between systems and their origin. The systems were divided into three types and later into subtypes. In all of them <em>cas1 </em>and <em>cas2 </em>constitute the universal core of the system. Cas1 is a metal-dependent DNase with no sequence specificity and is probably involved in the first step of the CRISPR mechanism – integration of a new spacer into the genome. Despite its presence in all CRISPR loci, the function of Cas2 remains elusive. Using freely available software and databases I tried to investigate this on my own.</p>
<p>A reliable source of information is the crystallographic study of the Cas2 structure from <em>Desulfovibrio vulgaris </em>(<em>Dvu</em>Cas2). Cas2 in most bacteria is not a very big protein. <em>Dvu</em>Cas2 is a homodimeric protein built of 102 amino acids. Each protomer contains an N-terminal βαββαβ ferredoxin fold joined with a fifth β-strand and a short helix at the C-terminus. The whole dimer is stabilized by hydrogen bonds.</p>
<p style="text-align: center;"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/10/3.-cas2-pdb.png"><img class="aligncenter size-full wp-image-2316" title="3. cas2 (pdb)" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/10/3.-cas2-pdb.png" alt="" width="438" height="209" /></a></p>
<p>In studies of Cascade (a complex of other CRISPR-associated proteins, not involving Cas2), it was shown that the complex is able to cleave the pre-crRNA transcript <em>in vitro</em>. This means that Cas2 is not necessary for RNA cleavage or processing in the CRISPR expression step. The study of Beloglazova (2008) showed metal-dependent endonuclease activity of the protein. Homologs from <em>Thermotoga maritima </em>and a few other strains also had RNase activity. Based on these results, the proposed function of Cas2 was selective degradation of phage transcripts or global translation inhibition via mRNA cleavage. A search for protein motifs using a freely available scanner returned no results.</p>
<p>Cas1 and Cas2 are the most conserved proteins in the <em>cas </em>cluster. For this reason I chose to compare <em>cas2 </em>between selected strains. In order to investigate the relationship between <em>Dvu</em>Cas2 and other <em>cas2 </em>genes, I used pairwise alignment for <em>cas2 </em>genes from 6 strains, including <em>Desulfovibrio vulgaris. </em>The greatest similarity was detected for <em>Desulfovibrio vulgaris, Haloarcula marismortui </em>and <em>Thermotoga thermarum. </em>The result is quite plausible. In the recent CRISPR classification, the systems of these strains (former <em>Tneap-Hmari </em>and <em>Dvulg </em>types) were shown to be related, as they share a common gene of the BH0338 family. The type found in <em>Aeropyrum pernix </em>(former <em>Apern</em>) also shares this gene, but this simple alignment showed no closer relationship.</p>
<p style="text-align: center;"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/10/2.-tree.png"><img class="aligncenter size-full wp-image-2317" title="2. tree" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/10/2.-tree.png" alt="" width="462" height="225" /></a></p>
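<p>For readers who have not used pairwise alignment before, the idea can be sketched with the classic Needleman-Wunsch global alignment algorithm. The Python code below is not the tool used for the comparison above; the scoring values (match 1, mismatch -1, gap -2) and the short test sequences are arbitrary assumptions chosen to keep the example small.</p>

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + pair(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(top, bottom):
    """Share of alignment columns in which both sequences carry the same letter."""
    same = sum(1 for x, y in zip(top, bottom) if x == y and x != "-")
    return 100.0 * same / len(top)


# Toy sequences, not real cas2 genes.
best, top, bottom = global_align("GATTACA", "GATACA")
identity = percent_identity(top, bottom)
```

<p>Running such a comparison for every pair of sequences gives a table of similarities, from which a tree like the one above can be drawn.</p>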
<p>The CRISPR system is still quite a new discovery in the area of bacterial genetics research. Because of the potentially interesting applications of the system, a lot of work is being done to elucidate its mechanism of action and the functions of its proteins. The presence and conservation of Cas2 among different types is evidence of its important role in the functioning of CRISPR. Up till now, that role remains elusive and mysterious.</p>
<p><strong>Bibliography:</strong></p>
<p>N. Beloglazova <em>et al.</em>, A novel family of sequence-specific endoribonucleases associated with the clustered regularly interspaced short palindromic repeats. J. Biol. Chem. 2008 Jul 18; 283(29): 20361-71.</p>
<p>S. J. J. Brouns <em>et al.</em>, Small CRISPR RNAs Guide Antiviral Defense in Prokaryotes. <em>Science</em> 15 August 2008: 321 (5891), 960-964.</p>
<p>P. Samai <em>et al.</em> (2010), Structure of a CRISPR-associated protein Cas2 from <em>Desulfovibrio vulgaris</em>. Acta Crystallographica Section F, 66: 1552–1556.</p>
<p><strong>Source of Cas2 structure: </strong>RCSB Protein Data Bank</p>
<p>J.L. Moreland, A. Gramada, O.V. Buzko, Q. Zhang and P.E. Bourne (2005), The Molecular Biology Toolkit (MBT): A Modular Platform for Developing Molecular Visualization Applications. BMC Bioinformatics, 6:21.</p>
]]></content:encoded>
			<wfw:commentRss>http://en.bioinformatyk.eu/contest-articles/bioinformatic-tools-for-investigating-cas2-the-crispr-associated-protein.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Prediction and application of intrinsically disordered regions in practice</title>
		<link>http://en.bioinformatyk.eu/contest-articles/prediction-and-application-of-intrinsically-disordered-regions-in-practice.html</link>
		<comments>http://en.bioinformatyk.eu/contest-articles/prediction-and-application-of-intrinsically-disordered-regions-in-practice.html#comments</comments>
		<pubDate>Mon, 03 Oct 2011 06:14:39 +0000</pubDate>
		<dc:creator>Adam Górka</dc:creator>
				<category><![CDATA[Contest articles]]></category>
		<category><![CDATA[aminoacids]]></category>
		<category><![CDATA[drug design]]></category>
		<category><![CDATA[MoRFs]]></category>
		<category><![CDATA[proteins]]></category>
		<category><![CDATA[quartet model]]></category>

		<guid isPermaLink="false">http://en.bioinformatyk.eu/?p=2301</guid>
		<description><![CDATA[by Adam Górka Department of Physical Biochemistry, Faculty of Biochemistry, Biophysics and Biotechnology, Jagiellonian University, Kraków, Poland. e-mail:adam.gorka@uj.edu.pl Protein quartet...]]></description>
			<content:encoded><![CDATA[<h4><strong><em>by Adam Górka</em></strong></h4>
<p>Department of Physical Biochemistry, Faculty of Biochemistry, Biophysics and Biotechnology, Jagiellonian University, Kraków, Poland.</p>
<p><em>e-mail: adam.gorka@uj.edu.pl</em></p>
<h2>Protein quartet model</h2>
<p style="text-align: center;">&nbsp;</p>
<div id="attachment_2302" class="wp-caption aligncenter" style="width: 406px"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/10/Figure1-ProteinQuartetModel.jpg"><img class="size-full wp-image-2302   " title="Figure1-ProteinQuartetModel" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/10/Figure1-ProteinQuartetModel.jpg" alt="" width="396" height="397" /></a><p class="wp-caption-text">Figure 1</p></div>
<p style="text-align: justify;">Proteins are the major components of living cells. It has recently been recognised that proteins and protein regions may exist in at least four different states of structural organization: ordered, molten globule, pre-molten globule, and coil-like (Figure 1). Moreover, protein function can be associated with any of these distinct states or with transitions between them. Entire proteins or protein regions in the pre-molten globule or coil-like state display no unique tertiary structure [1]. These Intrinsically Disordered Proteins (IDPs) and Intrinsically Disordered Regions (IDRs) are involved in key biological processes including transcription regulation, cell cycle control, recognition, and signaling [2].</p>
<h2 style="text-align: justify;">Order vs. Disorder</h2>
<p style="text-align: justify;">In contrast to ordered regions (ORs), IDRs exist as dynamic ensembles in which atom positions and backbone Ramachandran angles vary significantly over time, with no specific equilibrium values. The conformational changes of IDRs are typically non-cooperative and random. Although the structure of an IDR changes over time, this does not exclude the temporary presence of local, functional secondary structure that fluctuates in the absence of stabilizing forces [3, 4].</p>
<h2 style="text-align: justify;">How to become a disordered protein</h2>
<p style="text-align: justify;">The amino acid chain of IDPs is characterized by a high mean net charge and a low mean hydrophobicity per residue. The high net charge produces charge-charge repulsion that outweighs the weak hydrophobic force driving protein collapse, which prevents the formation of a stable tertiary structure [5, 6].</p>
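<p style="text-align: justify;">The balance between net charge and hydrophobicity can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular predictor. It assumes the Kyte-Doolittle hydropathy scale rescaled to the 0-1 range, counts only K/R and D/E as charged (histidine is ignored), and uses a linear boundary whose default slope and intercept are the values commonly quoted for the charge-hydropathy plot. All function names and defaults are assumptions made for this example.</p>

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(seq):
    """Mean Kyte-Doolittle hydropathy, rescaled so that R maps to 0 and I to 1."""
    scaled = [(KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in seq]
    return sum(scaled) / len(scaled)


def mean_net_charge(seq):
    """Absolute net charge per residue, counting K/R as +1 and D/E as -1."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return abs(positive - negative) / len(seq)


def looks_disordered(seq, slope=2.785, intercept=-1.151):
    """True if the sequence lies on the high-charge side of the assumed boundary."""
    return mean_net_charge(seq) > slope * mean_hydropathy(seq) + intercept


# Made-up sequences, not real proteins.
print(looks_disordered("EEEEKKKKEEEE"))  # charged, weakly hydrophobic
print(looks_disordered("LLLLVVVVIIII"))  # uncharged, strongly hydrophobic
```

<p style="text-align: justify;">A check of this kind only gives a coarse, whole-sequence classification; it says nothing about where in the chain the disorder is located.</p>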
<h2 style="text-align: justify;">Amino acid composition of intrinsically disordered regions</h2>
<p style="text-align: justify;">These properties are reflected in the amino acid composition of ordered and disordered regions. IDPs are enriched in amino acids that promote disorder, including A, R, G, Q, S, P, E, K [7, 8], and depleted in order-promoting amino acids that favour hydrophobic collapse: W, C, F, I, Y, V, L, N [7, 8]. Further differences exist between short (&lt; 30 amino acids) and long (&gt;= 30 amino acids) IDRs. Short regions are enriched in G and D and depleted in I, V and L, while long regions contain more K, E and P and fewer Q, G and N [9, 10].</p>
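<p style="text-align: justify;">Composition differences of this kind are straightforward to quantify. The sketch below computes the fraction of positions occupied by a chosen set of residues. The helper names are hypothetical and the example segment is made up. The two residue sets in the usage lines are the ones reported above for short IDRs.</p>

```python
from collections import Counter


def composition(seq):
    """Return the fraction of each residue type present in seq."""
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}


def fraction_in(seq, residues):
    """Fraction of positions in seq occupied by any residue from `residues`."""
    return sum(1 for aa in seq if aa in residues) / len(seq)


segment = "GSGDGDPSGG"  # made-up 10-residue segment, not real data
print(fraction_in(segment, "GD"))   # prints 0.7
print(fraction_in(segment, "IVL"))  # prints 0.0
```

<p style="text-align: justify;">Comparing such fractions between a candidate region and a reference set of ordered proteins is one simple way to see whether the region follows the trends listed above.</p>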
<h2 style="text-align: justify;">Intrinsically disordered regions in numbers</h2>
<p style="text-align: justify;">Bioinformatic analyses of the amino acid composition of proteins in known genomes show that unstructured regions are much more frequent in eukaryotic than in prokaryotic proteins. The Disopred2 algorithm predicts that 33% of eukaryotic proteins, but only 2% of archaeal and 4.2% of eubacterial proteins, contain IDRs longer than 30 amino acids [11]. A similar prediction with the PONDR VL-XT algorithm showed that 9-57% of archaeal proteins, 13-52% of bacterial proteins and 48-63% of eukaryotic proteins have IDRs longer than 30 amino acids. In contrast, IDRs longer than 50 amino acids were predicted for fewer than 10% of bacterial and archaeal proteins and for 25% of eukaryotic proteins [12]. It is estimated that 12% of proteins in eukaryotes are completely disordered [13], and 82-94% of transcription factors have long IDRs [14]. IDRs longer than 30 amino acids are also found in 79% of cancer-associated proteins and 66% of proteins involved in signal transduction [15]. Disordered regions are therefore common in proteins, which makes IDPs a good subject for bioinformatic protein sequence analysis and molecular modeling.</p>
<h2 style="text-align: justify;">Prediction of intrinsically disordered regions in practice</h2>
<p style="text-align: justify;">In practice the following approach can be used to predict globular regions with secondary structures and intrinsically disordered regions from amino acid sequences. As a first step, one should search for homologous sequences and collect useful literature data about their functional, interacting, and binding regions, posttranslational modifications, domain compositions etc. [13, 16].</p>
<p style="text-align: justify;">Next, to avoid pitfalls, one should analyse the composition and complexity of the homologous sequences; search for signal peptides, transmembrane regions, leucine zippers, zinc fingers, coiled-coil regions, modification sites, known sequence motifs, disulfide bridges and similar crystallographic structures; and generate HCA and LCA graphs of the sequences etc. [13, 16].</p>
<p style="text-align: justify;">Preparation of an alignment of the homologous sequences is the next step. The alignment has to be supplemented with secondary structure and disorder predictions from several ab initio methods (secondary structure prediction algorithms: PsiPred, Yaspin, Jpred 3, Sspro, Porter [17], Proteus 2 [18]; disorder prediction algorithms: GLOBPLOT 2.3, IUPRED 2, PONDR VL-XT, PONDR VSL2, metaPrDOS, Metadisorder [9], PONDR FIT [19]). The conserved sequences, secondary structures and disordered regions have to be specified. Pay attention to sequence deletions and insertions, which are likely to occur more often in disordered regions [20]. Consider that disordered regions commonly undergo a disorder-to-order transition upon binding. Such sequences can form unstable, fluctuating secondary structures [13, 16].</p>
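<p style="text-align: justify;">Disorder predictors typically report a score for each residue. As a hedged sketch of how such output can be turned into candidate regions, the hypothetical function below collects runs of consecutive residues whose score reaches a threshold and keeps those of at least a minimum length. The 0.5 cut-off is an assumption, since each predictor defines its own scale. The 30-residue default mirrors the boundary between short and long IDRs used in this article.</p>

```python
def disordered_regions(scores, threshold=0.5, min_length=30):
    """Return (start, end) index pairs, end exclusive, of long high-scoring runs."""
    regions = []
    start = None  # index where the current run began, or None outside a run
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            if i - start >= min_length:
                regions.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(scores) - start >= min_length:
        regions.append((start, len(scores)))
    return regions


# Made-up per-residue scores: 10 ordered, 40 disordered, 10 ordered residues.
scores = [0.2] * 10 + [0.9] * 40 + [0.1] * 10
print(disordered_regions(scores))  # prints [(10, 50)]
```

<p style="text-align: justify;">Running the same function on the output of several predictors and comparing the resulting intervals is one simple way to see where the methods agree.</p>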
<p style="text-align: justify;">In the next step, the domain organization of the protein should be proposed. The CDF-plot and CH-plot analysis methods can then be run to provisionally classify regions as ordered, disordered [21] or chameleon (morphing) sequence regions that could be both [22-24]. Chameleon sequences are potential binding sites. They can be disordered when the protein is isolated and act as molecular recognition features (MoRFs) that undergo a disorder-to-order transition upon binding. MoRFs can adopt diverse secondary structures in different complexes. Their occurrence correlates with ELMs and SLiMs inside long disordered regions [10, 15, 22].</p>
<p style="text-align: justify;">Prediction of MoRFs is a challenge. The first MoRF prediction algorithms, of moderate accuracy, such as PONDR VL-XT [25, 26] or ANCHOR [14, 27, 28], are available. New algorithms for identifying different types of MoRFs are under development [29, 30]. All information collected from the sequence analysis should be compared with, and verified against, the available literature and experimental data. Bioinformatic protein sequence analysis of IDPs allows existing hypotheses to be verified and new ones to be postulated [13, 16].</p>
<h2 style="text-align: justify;">Application of intrinsic disorder</h2>
<p style="text-align: justify;">&nbsp;</p>
<div class="mceTemp mceIEcenter" style="text-align: justify;">
<dl id="attachment_2303" class="wp-caption aligncenter" style="width: 465px;">
<dt class="wp-caption-dt"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/10/Figure2-DrugMimicsIDR.jpg"><img class="size-full wp-image-2303   " title="Figure2-DrugMimicsIDR" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/10/Figure2-DrugMimicsIDR.jpg" alt="" width="455" height="245" /></a></dt>
<dd class="wp-caption-dd">Figure 2</dd>
</dl>
</div>
<p style="text-align: justify;">Identification of MoRFs in IDPs also has practical applications. Knowledge of MoRFs can be used in rational drug design. MoRFs are short peptides (about 30 amino acids) inside long disordered regions that bind specifically to a molecular partner. Newly designed drugs can mimic a MoRF or its binding site, selectively targeting specific protein-protein interactions (Figure 2). Disordered region and MoRF prediction has been used by Molecular Kinetics, Inc. to identify 35 781 sequences in the human proteome that possess features of druggability [14, 30, 31].</p>
<h2 style="text-align: justify;">Conclusion</h2>
<p style="text-align: justify;">Prediction of MoRFs in IDRs opens new ways in rational drug design, because protein-protein interactions based on the disorder-to-order transition of one partner can make ideal druggable targets.</p>
<h2>References:</h2>
<p>1. Uversky VN: <strong>Natively unfolded proteins: a point where biology waits for physics.</strong> <em>Protein science : a publication of the Protein Society</em> 2002, <strong>11</strong>:739-56.</p>
<p>2. Uversky VN: <strong>The mysterious unfoldome: structureless, underappreciated, yet vital part of any given proteome.</strong> <em>Journal of biomedicine &amp; biotechnology</em> 2010, <strong>2010</strong>:568068.</p>
<p>3. Radivojac P, Iakoucheva LM, Oldfield CJ, et al.: <strong>Intrinsic disorder and functional proteomics.</strong> <em>Biophysical journal</em> 2007, <strong>92</strong>:1439-56.</p>
<p>4. Serdyuk IN: <strong>Structured proteins and proteins with intrinsic disorder</strong>. <em>Molecular Biology</em> 2007, <strong>41</strong>:262-277.</p>
<p>5. Uversky VN: <strong>Intrinsically disordered proteins from A to Z.</strong> <em>The international journal of biochemistry &amp; cell biology</em> 2011, <strong>43</strong>:1090-103.</p>
<p>6. Ashbaugh HS, Hatch HW: <strong>Natively unfolded protein stability as a coil-to-globule transition in charge/hydropathy space.</strong> <em>Journal of the American Chemical Society</em> 2008, <strong>130</strong>:9536-42.</p>
<p>7. Dunker AK, Lawson JD, Brown CJ, et al.: <strong>Intrinsically disordered protein.</strong> <em>Journal of molecular graphics &amp; modelling</em> 2001, <strong>19</strong>:26-59.</p>
<p>8. Uversky VN: <strong>What does it mean to be natively unfolded?</strong> <em>European journal of biochemistry / FEBS</em> 2002, <strong>269</strong>:2-12.</p>
<p>9. He B, Wang K, Liu Y, et al.: <strong>Predicting intrinsic disorder in proteins: an overview.</strong> <em>Cell research</em> 2009, <strong>19</strong>:929-49.</p>
<p>10. Xue B, Hsu W-L, Lee J-H, et al.: <strong>SPA: Short peptide analyzer of intrinsic disorder status of short peptides.</strong> <em>Genes to cells : devoted to molecular &amp; cellular mechanisms</em> 2010, <strong>15</strong>:635-46.</p>
<p>11. Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT: <strong>Prediction and functional analysis of native disorder in proteins from the three kingdoms of life.</strong> <em>Journal of molecular biology</em> 2004, <strong>337</strong>:635-645.</p>
<p>12. Chen JW, Romero P, Uversky VN, Dunker AK: <strong>Conservation of intrinsic disorder in protein domains and families: II. functions of conserved disorder.</strong> <em>Journal of proteome research</em> 2006, <strong>5</strong>:888-98.</p>
<p>13. Bourhis JM, Canard B, Longhi S: <strong>Predicting protein disorder and induced folding: from theoretical principles to practical applications.</strong> <em>Current protein &amp; peptide science</em> 2007, <strong>8</strong>:135-149.</p>
<p>14. Dunker AK, Uversky VN: <strong>Drugs for “protein clouds”: targeting intrinsically disordered transcription factors.</strong> <em>Current opinion in pharmacology</em> 2010, <strong>10</strong>:782-8.</p>
<p>15. Oldfield CJ, Cheng Y, Cortese MS, et al.: <strong>Coupled folding and binding with alpha-helix-forming molecular recognition elements.</strong> <em>Biochemistry</em> 2005, <strong>44</strong>:12454-70.</p>
<p>16. Ferron F, Longhi S, Canard B, Karlin D: <strong>A practical overview of protein disorder prediction methods.</strong> <em>Proteins</em> 2006, <strong>65</strong>:1-14.</p>
<p>17. Pirovano W, Heringa J: <strong>Protein secondary structure prediction.</strong> <em>Methods in molecular biology (Clifton, N.J.)</em> 2010, <strong>609</strong>:327-48.</p>
<p>18. Montgomerie S, Cruz JA, Shrivastava S, et al.: <strong>PROTEUS2: a web server for comprehensive protein structure prediction and structure-based annotation.</strong> <em>Nucleic acids research</em> 2008, <strong>36</strong>:W202-9.</p>
<p>19. Xue B, Dunbrack RL, Williams RW, Dunker AK, Uversky VN: <strong>PONDR-FIT: a meta-predictor of intrinsically disordered amino acids.</strong> <em>Biochimica et biophysica acta</em> 2010, <strong>1804</strong>:996-1010.</p>
<p>20. Brown CJ, Johnson AK, Dunker AK, Daughdrill GW: <strong>Evolution and disorder.</strong> <em>Current opinion in structural biology</em> 2011, <strong>21</strong>:441-6.</p>
<p>21. Uversky VN, Dunker AK: <strong>Understanding protein non-folding.</strong> <em>Biochimica et biophysica acta</em> 2010, <strong>1804</strong>:1231-64.</p>
<p>22. Uversky VN: <strong>Multitude of binding modes attainable by intrinsically disordered proteins: a portrait gallery of disorder-based complexes.</strong> <em>Chemical Society reviews</em> 2011, <strong>40</strong>:1623-34.</p>
<p>23. Rost B, Eyrich VA: <strong>EVA: large-scale analysis of secondary structure prediction.</strong> <em>Proteins</em> 2001, <strong>Suppl 5</strong>:192-9.</p>
<p>24. Uversky VN: <strong>Intrinsically disordered proteins may escape unwanted interactions via functional misfolding.</strong> <em>Biochimica et biophysica acta</em> 2011, <strong>1814</strong>:693-712.</p>
<p>25. Mohan A, Oldfield CJ, Radivojac P, et al.: <strong>Analysis of molecular recognition features (MoRFs).</strong> <em>Journal of molecular biology</em> 2006, <strong>362</strong>:1043-59.</p>
<p>26. Vacic V, Oldfield CJ, Mohan A, et al.: <strong>Characterization of molecular recognition features, MoRFs, and their binding partners.</strong> <em>Journal of proteome research</em> 2007, <strong>6</strong>:2351-66.</p>
<p>27. Mészáros B, Simon I, Dosztányi Z: <strong>Prediction of protein binding regions in disordered proteins.</strong> <em>PLoS computational biology</em> 2009, <strong>5</strong>:e1000376.</p>
<p>28. Dosztányi Z, Mészáros B, Simon I: <strong>ANCHOR: web server for predicting protein binding regions in disordered proteins.</strong> <em>Bioinformatics (Oxford, England)</em> 2009, <strong>25</strong>:2745-6.</p>
<p>29. Cheng Y, Oldfield CJ, Meng J, et al.: <strong>Mining alpha-helix-forming molecular recognition features with cross species sequence alignments.</strong> <em>Biochemistry</em> 2007, <strong>46</strong>:13468-77.</p>
<p>30. Cheng Y, LeGall T, Oldfield CJ, et al.: <strong>Rational drug design via intrinsically disordered protein.</strong> <em>Trends in biotechnology</em> 2006, <strong>24</strong>:435-42.</p>
<p>31. Metallo SJ: <strong>Intrinsically disordered proteins are potential drug targets.</strong> <em>Current opinion in chemical biology</em> 2010, <strong>14</strong>:481-8.</p>
]]></content:encoded>
			<wfw:commentRss>http://en.bioinformatyk.eu/contest-articles/prediction-and-application-of-intrinsically-disordered-regions-in-practice.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PHYLODIGM – PHYLOgenetic tree DIGitalisation Manager</title>
		<link>http://en.bioinformatyk.eu/contest-articles/phylodigm-phylogenetic-tree-digitalisation-manager.html</link>
		<comments>http://en.bioinformatyk.eu/contest-articles/phylodigm-phylogenetic-tree-digitalisation-manager.html#comments</comments>
		<pubDate>Fri, 23 Sep 2011 11:33:00 +0000</pubDate>
		<dc:creator>Witold Januszewski</dc:creator>
				<category><![CDATA[Contest articles]]></category>
		<category><![CDATA[digitalisation]]></category>
		<category><![CDATA[graph]]></category>
		<category><![CDATA[image processing]]></category>
		<category><![CDATA[Phylogenetic trees]]></category>
		<category><![CDATA[trees]]></category>

		<guid isPermaLink="false">http://en.bioinformatyk.eu/?p=2269</guid>
		<description><![CDATA[by Witold Januszewski Summary The article presents an original application for automated phylogenetic tree digitalisation. PHYLODIGM – PHYLOgenetic tree DIGitalisation...]]></description>
			<content:encoded><![CDATA[<p><em><strong>by Witold Januszewski</strong><strong> </strong></em></p>
<p><strong>Summary</strong></p>
<p style="text-align: justify;">The article presents an original application for automated phylogenetic tree digitalisation. PHYLODIGM – PHYLOgenetic tree DIGitalisation Manager uses a set of image processing and recognition methods to build an acyclic graph with an accompanying Newick format description, the latter being a punctuation-based standard for describing phylogenies.</p>
<p><strong>Introduction</strong></p>
<p style="text-align: justify;">Phylogenetic trees have proved to be optimal branching diagrams for presenting the findings of molecular evolution. Formal analysis of phylogenetic trees has revealed their acyclic graph structure. Each node of a phylogenetic tree corresponds to one species, whereas the distance between two nodes represents the evolutionary distance between the two species.</p>
<p style="text-align: justify;">These properties of phylogenetic trees were used in 1986 to create the Newick format, a punctuation-based description format for graph-theoretical trees. The format provides a textual way of describing parent nodes, leaf nodes and the distances between them, thus reducing the size of digital phylogram repositories.</p>
<p style="text-align: justify;">Recently, phylogram reconstruction programs that handle the Newick format, such as TreeSnatcher, TreeRipper or Dendroscope, have been introduced. Consequently, the creation of a comprehensive and immediate method of phylogenetic tree digitalisation has become a significant issue in bioinformatics.</p>
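<p style="text-align: justify;">To make the format concrete: in Newick notation, parentheses group the children of a node, commas separate siblings, a colon introduces a branch length and a semicolon ends the tree. The short Python sketch below serialises a small made-up tree into that notation. The tuple layout and the function name are assumptions chosen for the example and are not taken from PHYLODIGM.</p>

```python
def to_newick(node):
    """Serialise a (name, branch_length, children) tuple into Newick text."""
    name, length, children = node
    text = ""
    if children:
        # Children are written inside parentheses, separated by commas.
        text = "(" + ",".join(to_newick(child) for child in children) + ")"
    text += name
    if length is not None:
        text += ":" + str(length)
    return text


# Made-up tree: unnamed root with leaves A and B and an inner node holding C and D.
tree = ("", None, [
    ("A", 0.1, []),
    ("B", 0.2, []),
    ("", 0.5, [("C", 0.3, []), ("D", 0.4, [])]),
])
print(to_newick(tree) + ";")  # prints (A:0.1,B:0.2,(C:0.3,D:0.4):0.5);
```

<p style="text-align: justify;">The whole four-leaf tree, including its branch lengths, fits into a single line of text, which is what makes the format compact.</p>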
<p><strong>Methods</strong></p>
<p style="text-align: justify;">PHYLODIGM – PHYLOgenetic tree DIGitalisation Manager was conceived as a Java application for three reasons. First, it follows the principle of multiplatform object-oriented programming. Second, it allows mobile versions of the program to be produced for portable camera-bearing or image-reading devices such as mobile phones or certain palmtops. Third, it eases the testing and provision of extension modules such as reasoners or additional image processing methods. These design choices extend the use cases of PHYLODIGM considerably.</p>
<p><span style="text-decoration: underline;">Preprocessing</span></p>
<p style="text-align: center;"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/09/fig11.png"><img class="aligncenter size-full wp-image-2270" title="fig1" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/09/fig11.png" alt="" width="443" height="241" /></a></p>
<p style="text-align: justify;">After the user has uploaded the source tree image (see Fig. 1a and 1b) via the &#8216;Acquire Image&#8217; command button, the initial stage of phylogenetic tree digitalisation, the preprocessing, can begin (see the GUI snapshot in Fig. A). Following the &#8216;Execute Preprocessing&#8217; command, PHYLODIGM converts the source image to .PNG format and trims the image borders (where present) for processing efficiency.</p>
<div id="attachment_2271" class="wp-caption aligncenter" style="width: 359px"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/09/FigA-Preprocessing-GUI-Snapshot.png"><img class="size-full wp-image-2271 " title="FigA Preprocessing (GUI Snapshot)" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/09/FigA-Preprocessing-GUI-Snapshot.png" alt="" width="349" height="279" /></a><p class="wp-caption-text">Fig.A: Preprocessing (GUI Snapshot)</p></div>
<p><span style="text-decoration: underline;">Junction detection and edge extraction</span></p>
<p style="text-align: justify;">Prior to junction detection and edge extraction, the source image undergoes binary segmentation and the appropriate morphological operations: erosion and skeletonisation (see Fig. 2-3). PHYLODIGM then proceeds with Hit-and-Miss transform (HMT) morphological pattern matching. The method locates and returns all pixel positions of the image matrix that match the provided kernel pattern. PHYLODIGM runs HMT twice: first with a kernel pattern of three meeting lines to retrieve junctions (Fig. 4), then with a free-line-end kernel pattern to retrieve the pixels at the edge endings (Fig. 5). The user can either accept the automated processing or use the drawing GUI tools (Fig. B) to add missing tips and edges or delete misrecognised ones.</p>
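<p style="text-align: justify;">The hit-and-miss idea can be illustrated with a small pure-Python sketch. It is not PHYLODIGM's Java implementation. The 3x3 kernels, the function name and the tiny test image are assumptions made for the example. Each kernel entry is 1 (pixel must be foreground), 0 (pixel must be background) or None (ignored), and pixels outside the image are treated as background.</p>

```python
def hit_and_miss(image, kernel):
    """Return (row, col) positions where the 3x3 kernel matches the binary image."""
    rows, cols = len(image), len(image[0])
    matches = []
    for r in range(rows):
        for c in range(cols):
            ok = True
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    want = kernel[dr + 1][dc + 1]
                    if want is None:
                        continue  # this neighbour is not constrained
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    pixel = image[rr][cc] if inside else 0
                    if pixel != want:
                        ok = False
            if ok:
                matches.append((r, c))
    return matches


# Made-up skeleton: a horizontal line with a vertical branch hanging from it.
image = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

# Three lines meeting: left, right and lower neighbours set, upper one empty.
T_JUNCTION = [
    [None, 0, None],
    [1, 1, 1],
    [None, 1, None],
]

# Free line end: only the upper neighbour is set.
END_BELOW = [
    [None, 1, None],
    [0, 1, 0],
    [0, 0, 0],
]

print(hit_and_miss(image, T_JUNCTION))  # prints [(1, 2)]
print(hit_and_miss(image, END_BELOW))   # prints [(3, 2)]
```

<p style="text-align: justify;">Each kernel only matches one orientation, so a complete detector would apply rotated variants of both patterns. In this example, the two ends of the horizontal line would need left-facing and right-facing end kernels.</p>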
<div id="attachment_2272" class="wp-caption aligncenter" style="width: 480px"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/09/fig22.png"><img class="size-full wp-image-2272 " title="fig2" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/09/fig22.png" alt="" width="470" height="255" /></a><p class="wp-caption-text">Fig.2</p></div>
<p style="text-align: center;">&nbsp;</p>
<div id="attachment_2273" class="wp-caption aligncenter" style="width: 480px"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/09/fig32.png"><img class="size-full wp-image-2273  " title="fig3" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/09/fig32.png" alt="" width="470" height="255" /></a><p class="wp-caption-text">Fig.3</p></div>
<div id="attachment_2274" class="wp-caption aligncenter" style="width: 480px"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/09/fig4.png"><img class="size-full wp-image-2274  " title="fig4" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/09/fig4.png" alt="" width="470" height="255" /></a><p class="wp-caption-text">Fig.4</p></div>
<div id="attachment_2277" class="wp-caption aligncenter" style="width: 480px"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/09/fig5.png"><img class="size-full wp-image-2277  " title="fig5" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/09/fig5.png" alt="" width="470" height="255" /></a><p class="wp-caption-text">Fig.5</p></div>
<div id="attachment_2278" class="wp-caption aligncenter" style="width: 359px"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/09/FigB-Junction-detection-and-edge-extraction-GUI-Snapshot.png"><img class="size-full wp-image-2278 " title="FigB Junction detection and edge extraction (GUI Snapshot)" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/09/FigB-Junction-detection-and-edge-extraction-GUI-Snapshot.png" alt="" width="349" height="279" /></a><p class="wp-caption-text">Fig.B: Junction detection and edge extraction (GUI Snapshot)</p></div>
<p><span style="text-decoration: underline;">Node linkage</span></p>
<p style="text-align: justify;">The detected junctions become the nodes of the acyclic graph and undergo a linkage process based on linear interpolation, according to the formulas shown in Fig. C. The value of a connected pixel <em>P0</em> is calculated from the neighbour pixel values <em>P31</em> and <em>P42</em>, with respect to the distances:</p>
<p>- <em>dx</em>: between <em>P0</em> and the center of the line linking <em>P1</em> and <em>P3</em></p>
<p>- <em>dy</em>: between <em>P0</em> and the center of the line linking <em>P2</em> and <em>P4</em></p>
<div id="attachment_2275" class="wp-caption aligncenter" style="width: 424px"><em><em><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/09/FigC-Node-linkage-method-by-interpolation.png"><img class="size-full wp-image-2275 " title="FigC Node linkage method by interpolation" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/09/FigC-Node-linkage-method-by-interpolation.png" alt="" width="414" height="146" /></a></em></em><p class="wp-caption-text">Fig.C: Node linkage method by interpolation</p></div>
<p><strong>Results</strong></p>
<p style="text-align: justify;">The interpolated image data, after the measurement of leaf node trajectories, forms the resulting digital phylogram according to the grammar rules of the Newick format. We provide both the Newick description and the corresponding graph visualisation as output (Fig. D). The results may be exported as a raw image, a .PDF file or a textual Newick description. The same GUI panel also enables the user to import a custom Newick tree description into PHYLODIGM. Tests on both rectangular and freeform trees produced the expected digitalisation results.</p>
<div id="attachment_2276" class="wp-caption aligncenter" style="width: 358px"><a href="http://en.bioinformatyk.eu/wp-content/uploads/2011/09/FigD-Exporting-the-constructed-graph-GUI-Snapshot.png"><img class="size-full wp-image-2276 " title="FigD Exporting the constructed graph (GUI Snapshot)" src="http://en.bioinformatyk.eu/wp-content/uploads/2011/09/FigD-Exporting-the-constructed-graph-GUI-Snapshot.png" alt="" width="348" height="277" /></a><p class="wp-caption-text">Fig.D: Exporting the constructed graph (GUI Snapshot)</p></div>
<p><strong>Conclusion</strong></p>
<p style="text-align: justify;">The author believes that PHYLODIGM provides an innovative method for automated phylogenetic tree digitalisation. The output phylograms should satisfy the requirements of phylogenetic analysis in a broad sense. In the next development phase, PHYLODIGM will be extended with a .PDF tree search engine, a reasoning module based on a neural network or another machine learning method, and the capability of phylogenetic network digitalisation.</p>
<p style="text-align: justify;">Rapid phylogenetic tree digitalisation using PHYLODIGM versions for PCs and mobile devices would support knowledge discovery from archival documents containing printed or hand-drawn images of phylogenetic trees. Libraries storing documents on phylogenetics in any language could qualify as data sources. This could lead to the creation of a globally used and respected database, thus spurring the development of phylogeny analysis in bioinformatics and other sciences.</p>
<p><strong>References</strong></p>
<p>1. Hughes, J.; <em>TreeRipper: towards a fully automated optical tree recognition software</em>; Nature Precedings; Nature Publishing Group; 2010</p>
<p>2. Huson, D.; Richter, D.; Rausch, C.; Dezulian, T.; Franz, M. &amp; Rupp, R.; <em>Dendroscope: An interactive viewer for large phylogenetic trees</em>; BMC Bioinformatics 8/6; BioMed Central Ltd; 2007</p>
<p>3. Laubach, T. &amp; Von Haeseler, A.; <em>TreeSnatcher: coding trees from images</em>; Bioinformatics 23/6; Oxford University Press; 2007</p>
<p><strong>Warsaw University of Technology, Faculty of Electronics and Information Technology, Institute of Radioelectronics, </strong><strong>Nuclear and Medical Electronics Division, Laboratory </strong><strong>of Detection and Spectrometry</strong></p>
<p><strong>Address:</strong> Nowowiejska 15/19, 00-665, Warsaw, Poland,</p>
<p><strong>E-mail:</strong> W.Januszewski@stud.elka.pw.edu.pl;</p>
]]></content:encoded>
			<wfw:commentRss>http://en.bioinformatyk.eu/contest-articles/phylodigm-phylogenetic-tree-digitalisation-manager.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

