Convergence of Domain Architecture, Structure, and Ligand Affinity in Animal and Plant RNA-Binding Proteins

Raquel Dias,1 Austin Manny,2 Oralia Kolaczkowski,2 and Bryan Kolaczkowski*,2,3

1 Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ
2 Department of Microbiology & Cell Science, Institute of Food and Agricultural Sciences, University of Florida, Gainesville, FL
3 Genetics Institute, University of Florida, Gainesville, FL

*Corresponding author: E-mail: bryank@u.edu.
Associate editor: Claus Wilke

Abstract

Reconstruction of ancestral protein sequences using phylogenetic methods is a powerful technique for directly examining the evolution of molecular function. Although ancestral sequence reconstruction (ASR) is itself very efficient, the downstream functional and structural studies necessary to characterize when and how changes in molecular function occurred are often costly and time-consuming, currently limiting ASR studies to examining a relatively small number of discrete functional shifts. As a result, we have very little direct information about how molecular function evolves across large protein families. Here we develop an approach combining ASR with structure and function prediction to efficiently examine the evolution of ligand affinity across a large family of double-stranded RNA binding proteins (DRBs) spanning animals and plants. We find that the characteristic domain architecture of DRBs, consisting of 2–3 tandem double-stranded RNA binding motifs (dsrms), arose independently in early animal and plant lineages. The affinity with which individual dsrms bind double-stranded RNA appears to have increased and decreased often across both animal and plant phylogenies, primarily through convergent structural mechanisms involving RNA-contact residues within the β1–β2 loop and a small region of α2. These studies provide some of the first direct information about how protein function evolves across large gene families and suggest that changes in molecular function may occur often and unassociated with major phylogenetic events, such as gene or domain duplications.
Keywords: ancestral sequence reconstruction, double-stranded RNA binding proteins, protein family evolution, molecular functional evolution, RNA interference.

Introduction

Understanding how proteins evolve novel functional repertoires remains an important goal of molecular and evolutionary biology (Whelan and Goldman 2001; King et al. 2003; Orengo and Thornton 2005). Emerging techniques combining ancestral sequence reconstruction (ASR) with laboratory functional assays and structure determination have allowed researchers to meticulously characterize the evolutionary and structural bases for changes in molecular function (Malcolm et al. 1990; Shih et al. 1993; Ugalde et al. 2004; Bridgham et al. 2006, 2009; Zmasek and Godzik 2011; Voordeckers et al. 2012; van Hazel et al. 2013; Ogawa and Shirai 2014; Whitfield et al. 2015; Clifton and Jackson 2016). While these approaches provide unprecedented opportunities to rigorously investigate the molecular-functional evolution of protein families (Shih et al. 1993; Hanson-Smith et al. 2010; Harms and Thornton 2010; Merkl and Sterner 2016), their reliance on detailed experimental methods limits the scale at which ancestral protein resurrection can be applied.
© The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. Mol. Biol. Evol. doi:10.1093/molbev/msx090. Advance Access publication February 25, 2017.

Several mechanisms can contribute to the generation of new protein functions (Chen et al. 2013), including gene duplication, fission, or fusion (Song et al. 1987; Wang et al. 2004), retrotransposition (Cordaux and Batzer 2009), de novo gene origination (Cai et al. 2008), lateral transfer (Dunning Hotopp et al. 2007), shifts in a gene's reading-frame (Ohno 1984) and domain shuffling (Pao and Saier 1995). The importance of gene duplication for generating molecular-functional novelty across protein families is in little doubt (Saha et al. 2006), even if the particular mechanisms by which duplication allows for functional evolution may be multifaceted (Rastogi and Liberles 2005; Bridgham et al. 2008). Aside from gene dosage effects (Veitia et al. 2013) and post-duplication changes in gene regulation (Nguyen Ba et al. 2014), retention of duplicate genes over long periods of time is generally considered to require significant alteration of at least one duplicate protein's molecular function (Hughes 1994; Zhang 2003). Post-duplication changes in protein function have been observed in many ASR studies (Tirosh and Barkai 2007; Zhang et al. 2009; Kuraku 2013). Although these findings can be taken as evidence that gene duplication may correlate with functional evolution (Taylor and Raes 2004; Conant and Wolfe 2008; Kassahn et al. 2009), less effort has been invested in looking for functional evolution not associated with gene duplications in large protein families (Bridgham et al. 2008; Hobbs et al. 2012; Bridgham et al. 2014). The low throughput of traditional ASR approaches, coupled with an historical focus
on gene duplications, means we have very little unbiased information about how molecular function evolves in large protein families, particularly across deep phylogenetic history.

Here we develop an approach that combines large-scale ancestral sequence reconstruction with molecular dynamics and structure-based affinity prediction to characterize the evolution of molecular function across a large family of double-stranded RNA binding proteins (DRBs). DRBs coordinate the first steps of the RNA interference (RNAi) process, working with Dicer to select dsRNA targets and generate RNA fragments for loading onto the RNA-induced silencing complex (Chendrimada et al. 2005; Liu et al. 2006; Kok et al. 2007; Curtin et al. 2008; Cenik et al. 2011; Fukunaga et al. 2012). Vertebrate DRBs have additionally been shown to regulate cellular stress responses through interactions with Protein Kinase R (Daher et al. 2009; Dickerman et al. 2015). DRBs consist of 2–3 double-stranded RNA-binding motifs (dsrms), short functional domains that either bind double-stranded RNAs or facilitate protein–protein interactions (see supplementary fig. S1, Supplementary Material online) (Kurihara et al. 2006; Laraki et al. 2008; Yang et al. 2010; Wilson et al. 2015). Although DRB function has been examined in a handful of model animals and plants, very little is known about DRB evolutionary history or about how the functional diversity of DRB dsrms evolved (Clavel et al. 2016).

Results and Discussion

DRB Protein Families Diversified Independently in Animals and Plants

To begin examining the molecular-functional evolution of double-stranded RNA-binding proteins (DRBs), we identified any protein sequence from NCBI's NR database encoding 2–3 double-stranded RNA-binding motifs (dsrms) and no other annotated functional domains, consistent with the characteristic domain architecture of DRBs from well-studied model organisms (Yang et al. 2010; Wilson et al. 2015). To construct a reliable consensus phylogeny, we aligned full-length DRB sequences and individual functional domains using a variety of approaches, inferred maximum-likelihood phylogenies from each alignment and combined results using both supermatrix
and supertree approaches (see Materials and Methods for details).

A strongly supported consensus phylogeny across all alignment methods and tree-reconstruction approaches suggests that DRB protein families diversified independently early in animal and plant lineages (fig. 1; supplementary file full_trees.nexus.txt contains all trees, and Files DRB_full_idmap.txt and dsrm_full_idmap.txt contain Genbank accession numbers for all sequences, Supplementary Material online). All plant DRBs were monophyletic with >0.94 SH-like aLRT, while animal DRBs grouped with animal Staufen proteins (support >0.92). Within the plant clade, the well-studied DRB1 protein from monocots, dicots and basal vascular plants grouped with a recently characterized DRB6 (support >0.94), but DRB6 has been lost from Brassicaceae (Clavel et al. 2016). Plant DRB4 grouped with an unresolved clade of DRBs from early vascular plants as well as DRB2/3/5 sequences from monocots and dicots (support >0.96), although the DRB2/3/5 clade did not fully resolve in the consensus tree. That sequences from mosses group tightly with DRB1, DRB6 and DRB2/3/5/4 clades suggests that these major gene duplications occurred early in the plant lineage, with later divergence of DRBs 2, 3, and 5, possibly in flowering plants. Given the consensus tree, the timing of DRB4's origin is unclear; it could have diverged from plant DRB2/3/5 in flowering plants or earlier.

Within the animal clade, DRB sequences from bilateria separated from Staufen proteins and DRB-like proteins from cnidaria with >0.96 SH-like aLRT (fig. 1). While DRBs from arthropods (LOQS) and vertebrates (TARBP2, PRKRA) grouped with lophotrochozoan and invertebrate deuterostome DRBs (support >0.98), the nematode DRB (RDE4) and one of the arthropod DRBs (R2D2) were basal to the main DRB clade (G in fig. 1). This suggests that either the ancestral DRB duplicated early in the bilaterian lineage, with arthropods retaining two DRB genes, nematodes losing one, and the remaining bilateria losing the other, or phylogenetic errors such as long-branch attraction artifactually reshaped the branching pattern of early animal DRB divergence in our analysis.
The grouping of long-branched taxa at the base of a relatively shorter-branched clade is a classic signature of long-branch attraction (Felsenstein 1978; Kuck et al. 2012). However, our previous analysis of Dicer and Argonaute protein families also participating in RNAi suggested that these genes also duplicated early in bilateria, with duplicates being lost in non-arthropods (Mukherjee et al. 2013). These results are consistent with a model in which the entire RNAi pathway may have shared an ancient duplication event, followed by lineage-specific losses. Given current results and sequence data, we feel the most appropriate conclusion is to remain agnostic as to the precise pattern of DRB duplications in the animal lineage, although the early divergence of bilaterian DRBs from Staufens appears well-supported, as does a later DRB duplication in the vertebrate lineage (support >0.86; fig. 1).

Although phylogenetic certainty is impossible to completely ensure, and systematic artifacts can generate strongly supported errors in some cases, the fact that the same general tree topology is recovered using different sequence alignments, alignment processing, and tree inference strategies suggests our consensus phylogeny is largely robust to many of the major sources of phylogenetic uncertainty and bias (Zwickl and Hillis 2002; Ogden and Rosenberg 2006). While additional sequence data and major advancements in phylogenetic methods may revise our conclusions in the future, we feel our consensus tree represents a reasonable inference of DRB evolutionary history, given current data and methodology.

DRB's Tandem-dsrm Domain Architecture Arose Independently in Animals and Plants

Animal and plant DRBs have a fairly consistent domain architecture; all well-studied plant DRBs encode two double-stranded RNA-binding motifs (dsrms), whereas animal DRBs encode 2–3 dsrms (Yang et al. 2010; Wilson et al. 2015). No major variations on this 2–3 dsrm domain
architecture have been observed, with the recent exception of a possible single-dsrm protein from plants (Clavel et al. 2016). To characterize when and how the DRB domain architecture evolved, we identified all dsrm protein sequences from the NCBI RefSeq database and clustered dsrm proteins by sequence-similarity and phylogenetic analyses to identify those most closely related to dsrms from DRBs (see Materials and Methods, supplementary text S1, tables S1–S3, and fig. S2, Supplementary Materials online). To mitigate potential phylogenetic errors when examining the evolutionary history of short functional domains over long timescales, we used a structural alignment of available dsrm structures and similar folds to align dsrm-related protein sequences for reconstructing the maximum-likelihood domain phylogeny (see Materials and Methods).

We found that all animal dsrms from DRB proteins were monophyletic (SH-like aLRT = 0.98), all plant dsrms were monophyletic (support = 0.99), and dsrms from animal and plant DRBs were separated from dsrms from other proteins with maximal support (fig. 2; supplementary file full_trees.nexus.txt). Even given the short dsrm sequences, individual dsrm clades were fairly well-supported within animal and plant lineages. The second plant dsrm (dsrm2) was monophyletic with SH-like aLRT = 0.96. Animal dsrm1 and dsrm3 were each monophyletic with support = 0.85 and 0.99, respectively. Aside from dsrm2 from arthropod R2D2, animal dsrm2 domains grouped together with 0.95 support, but the branching order of animal DRB dsrm2s and Staufen dsrms was unresolved. Plant dsrm1 sequences did not form a monophyletic clade with strong support in the consensus phylogeny, but dsrm1 sequences from different plant DRBs did form respective monophyletic groups (support >0.91). These results are largely consistent with recent phylogenetic analyses of plant DRB and dsrm sequences (Clavel et al. 2016).
Together, our results support a model in which a single ancestral dsrm domain duplicated independently in animal and plant lineages, suggesting that the 2–3 dsrm domain architecture of animal and plant DRBs is a case of convergent evolution. Although we feel the structural alignment is probably more accurate than sequence-based alignments in this case, similar results were obtained using three different sequence alignment strategies, indicating these results are generally robust to alignment ambiguity (supplementary figs. S3–S5, Supplementary Material online). Although support for the monophyletic groupings of dsrm1, dsrm2, and dsrm3 domains was not always high, phylogenetic inferences do not appear to be strongly affected by long-branch attraction or other biases, as major taxonomic groupings tend to follow current species tree estimates. These results generally argue against widespread domain-shuffling or other complex evolutionary scenarios shaping animal or plant DRBs.

[Figure 1: consensus phylogeny of DRB proteins; image not reproduced. Scale bar: 0.5 substitutions/site. Taxonomic groups in the legend: Vertebrates, Arthropods, Nematodes, Lophotrochozoa, Fungi, Eudicots, Monocots, Protozoa. Labeled nodes: A–H.]

Clade support (SH-like aLRT)
Clade               Supermatrix   Average   Supertree
Bacteria/Protozoa   1.00          0.90      0.86
Plant DRBs
  A                 0.99          0.94      0.96
  B                 0.99          0.94      0.99
  C                 0.96          0.99      0.99
  D                 1.00          0.98      0.66
  Plant DRB6        0.95          0.86      1.00
  Plant DRB1        0.92          0.93      0.94
  Plant DRB4        0.99          0.90      1.00
  Plant DRB2        1.00          0.53      1.00
  Plant DRB3/5      0.99          0.90      0.99
Animal DRBs
  E                 0.97          0.92      1.00
  F                 0.97          0.96      0.99
  G                 1.00          0.98      1.00
  H                 0.99          0.86      0.99
  Animal Staufen    1.00          0.75      1.00
  Nematode RDE4     0.99          0.83      0.16
  Arthropod R2D2    0.85          0.99      0.89
  Arthropod LOQS    1.00          0.91      1.00
  Vertebrate PRKRA  1.00          1.00      0.99
  Vertebrate TARBP2 0.92          0.97      1.00

FIG. 1. Double-stranded RNA-binding proteins (DRBs) diversified independently in animals and plants. We reconstructed maximum-likelihood phylogenies of all identifiable DRB protein sequences using a variety of alignment strategies and tree reconstruction approaches (see Materials and Methods). We show a consensus tree across all reconstructions. Branch lengths are scaled to the average number of substitutions/site, and major taxonomic groups are indicated by branch color. SH-like aLRT support for major clades is indicated in the table for the supermatrix tree reconstruction, the average support over all individual alignments and the supertree approach (see Materials and Methods); support values <0.9 are red, and values <0.8 are bold. Nodes on the consensus tree are collapsed if they had <0.8 support from all three methods.

Alternatively, the canonical domain architecture could have evolved before the animal–plant split, and partial gene-conversion events or phylogenetic artifacts may be responsible for the apparent respective monophyly of animal and plant dsrms. We did not observe strong evidence for widespread gene conversion among extant DRBs (supplementary table S4, Supplementary Material online). After removing annotated isoforms, we identified 91 pairs of DRB sequences (out of 1793 sequences) that showed significant support for possible gene-conversion events in at least one region (5% of sequences at P < 0.05). Nearly all of these possible gene-conversion events (83) were among closely related mammal TARBP2 sequences, with only three among mammal PRKRA, two among arthropod DRBs, and three among plant DRBs. These results argue against widespread gene conversion affecting the major branching pattern of the dsrm phylogeny, although it may impact the branching pattern within mammalian TARBP2 sequences.

The finding that animal and plant dsrm domains duplicated to produce DRB domain architectures independently in these lineages suggests our initial approach aligning full-length animal and plant DRBs could have introduced potential phylogenetic artifacts (fig. 1). To address this, we inferred separate maximum-likelihood phylogenies of full-length animal and plant DRBs, using respective dsrm outgroup information consistent with the hypothesis that animal and plant DRB domain architectures were independently derived (supplementary fig. S6, Supplementary Material online).
These individual animal and plant DRB trees were consistent with the major clades identified in our initial analysis of full-length DRB sequences (see fig. 1; supplementary fig. S6, Supplementary Material online), suggesting our consensus DRB phylogeny is robust.

High Affinity for RNA Arose Independently in Animal and Plant dsrms

DRB dsrms from model organisms have been observed to play two different functional roles: they bind double-stranded RNA molecules and/or facilitate protein–protein interactions, primarily with Dicer, mammalian PKR or by forming dimers (Kurihara et al. 2006; Laraki et al. 2008; Yang et al. 2010; Wilson et al. 2015). To begin examining how this functional diversity evolved, we reconstructed ancestral protein sequences at early key nodes in the animal and plant dsrm phylogeny, inferred structural complexes with dsRNA by homology modeling, energy-optimized these models by molecular dynamics and predicted dsrm–RNA affinities [pKd = -log10(Kd)] using a previously developed statistical machine learning approach (see Materials and Methods).

[Figure 2: dsrm domain phylogeny with bar charts of binding affinity (axis range 3 to 6) at ancestral nodes; image not reproduced. Scale bar: 0.4 substitutions/site. Legend: predicted pKds (ML sequence, sampled sequences); empirical pKd (ML sequence). Clade labels: other dsrms; Staufen dsrms; diploblast dsrms; plant dsrm1 and dsrm2 (DRB1, DRB2/3/5, DRB4, DRB6); animal dsrm1, dsrm2 and dsrm3 (R2D2, LOQS, TARBP2, PRKRA). Ancestral nodes shown: ancPlantDRB6dsrm2, ancPlantDRB2/3/5dsrm2, ancPlantDRB1dsrm2, ancPlantdsrm2, ancPlantDRB1dsrm1, ancPlantdsrm, ancAnimaldsrm3, ancAnimaldsrm, ancAnimaldsrm1, ancAnimaldsrm2.]

FIG. 2. The multiple-dsrm domain architecture of animal and plant DRBs evolved independently, and dsrm–RNA affinities diversified early. We reconstructed the maximum-likelihood domain phylogeny of dsrm functional domains from animal and plant DRB genes, rooted using dsrm domains from other genes and aligned by structure (see Materials and Methods). We plot a consensus tree in which nodes with <0.8 SH-like aLRT are collapsed to polytomies. Branch lengths are scaled to substitutions/site. Ancestral sequences were reconstructed at key nodes on the phylogeny (brown circles), and we inferred the structures of ancestral dsrm protein sequences bound to dsRNA by homology modeling and molecular dynamics; inferred dsrm–RNA complexes were used to predict RNA binding affinities (see Materials and Methods). We plot the predicted dsrm–RNA affinities (pKds) of each ancestral sequence, inferred using maximum-likelihood (dark gray bars) or by sampling from the ancestral state posterior distribution (medium gray bars). Light gray bars indicate experimentally determined dsrm–RNA affinities, with standard errors shown (see Materials and Methods for ancestral reconstruction and experimental details). Red triangles indicate significant increases in dsrm–RNA affinities, and blue arrows indicate significant decreases, based on experimentally determined affinity values (P < 0.05). Ancestral nodes for which maximum-likelihood and sampled ancestral sequences had significantly different predicted affinities are indicated by red stars (P < 0.05).

Although maximum-likelihood ancestral sequence reconstruction (ASR) is typically considered robust (Hanson-Smith et al. 2010), some concerns have been raised that choosing the maximum-likelihood state at every position in the ancestral sequence could introduce functional artifacts in some cases, particularly when protein stability is an important component of molecular function (Williams et al. 2006). To address this concern, some researchers have suggested sampling a large number of possible ancestral sequences from the posterior distribution at each site (Pollock and Chang 2007), but to the best of our knowledge, this approach has never been used in practice, due to the cost of experimentally examining the functions of large numbers of ancestral sequences.

As affinity prediction approaches do not suffer from the same efficiency limitations as laboratory analyses, we examined the robustness of affinity estimates to ASR ambiguity by reconstructing multiple "random draws" from each ancestral sequence's posterior distribution and comparing pKd estimates across these posterior-draw sequences to the pKd of the maximum-likelihood ancestral sequence, averaged over multiple structural replicates (see Materials and Methods). Nodes for which the predicted RNA affinity of the maximum-likelihood ancestral sequence was not significantly different from the distribution of RNA affinities over random draws were considered robust to ancestral sequence uncertainty; we then expressed the maximum-likelihood protein and measured its affinity for short dsRNA experimentally (see Materials and Methods).
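The difference between the maximum-likelihood ancestral sequence and a "random draw" from the posterior distribution can be illustrated with a minimal Python sketch. It assumes that sites are sampled independently and that per-site posterior probabilities are available as dictionaries; the four-site `posterior` table, the function names and the number of draws are hypothetical and are not taken from the study.

```python
import random

# Hypothetical per-site posterior probabilities for a four-residue fragment.
# Real values would come from the output of an ASR program.
posterior = [
    {"Q": 0.70, "H": 0.25, "N": 0.05},
    {"S": 0.95, "T": 0.05},
    {"T": 0.60, "K": 0.40},
    {"A": 1.00},
]

def ml_sequence(site_posteriors):
    """Take the highest-probability residue at every site."""
    return "".join(max(site, key=site.get) for site in site_posteriors)

def sample_sequence(site_posteriors, rng):
    """Draw one residue per site in proportion to its posterior probability."""
    residues = []
    for site in site_posteriors:
        states = list(site)
        weights = [site[state] for state in states]
        residues.append(rng.choices(states, weights=weights, k=1)[0])
    return "".join(residues)

rng = random.Random(0)  # fixed seed so the draws are reproducible
ml = ml_sequence(posterior)
draws = [sample_sequence(posterior, rng) for _ in range(100)]

print("ML sequence:", ml)
print("distinct sampled sequences:", sorted(set(draws)))
```

In a workflow like the one described, each sampled sequence would then be passed through the same structure-modeling and affinity-prediction steps as the maximum-likelihood sequence, so that the two sets of pKd estimates can be compared.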
We found that the predicted RNA affinities of 4/5 of the early animal ancestral dsrms were robust to uncertainty in the ancestral sequence reconstruction (at P > 0.05), whereas only 6/10 ancestral plant dsrms were robust to ASR uncertainty (fig. 2). For the cases in which predicted RNA affinities were unaffected by ancestral sequence uncertainty, experimental affinity estimates were generally consistent with maximum-likelihood pKd estimates (fig. 2). We observed at most a 3.6-fold difference between experimental and predicted RNA affinity. Only two nodes had >3-fold differences between experimental and predicted affinities (ancAnimaldsrm2 and ancAnimaldsrm1), and only four additional nodes had >2-fold affinity differences (ancAnimaldsrm, ancAnimaldsrm3, ancPlantDRB2/3/5dsrm2, and ancPlantDRB6dsrm2).

As figure 2 shows, both animal and plant ancestral dsrms had relatively low affinity for dsRNA (experimentally determined Kd > 17 µM, Km > 16 µM; see supplementary fig. S7, Supplementary Material online) and were statistically indistinguishable from one another (P > 0.34). Ancestral low affinity for RNA was retained in ancAnimaldsrm2 (Kd = 24.6 µM, Km = 22.9 µM; P > 0.27) and at least one of the ancestral plant dsrm1 lineages (ancPlantDRB4dsrm1; Kd = 38.9 µM, Km = 38.3 µM; P > 0.29). High affinity for dsRNA (~10-fold increase) evolved at least once in plants, along the branch leading to ancPlantdsrm2 (Kd = 5.2 µM, Km = 4.2 µM; P < 9.75e-4) and at least twice in animals, independently along branches leading to ancAnimaldsrm3 (Kd = 3.2 µM, Km = 4.1 µM; P < 4.44e-3) and ancAnimaldsrm1 (Kd = 4.0 µM, Km = 4.2 µM; P < 1.42e-2). Finally, ancPlantDRB6dsrm2 re-evolved low affinity for dsRNA after it diverged from ancPlantdsrm2 (ancPlantDRB6dsrm2 Kd = 24.4 µM, Km = 24.8 µM; P < 1.17e-2).

The dsrm structural fold is highly conserved across animals and plants, and structural studies of dsrm–RNA interactions have indicated that dsrms form stabilizing interactions with RNA through two primary interfaces, a loop between β1 and β2, which inserts a canonical histidine into the RNA minor groove, and a cluster of basic residues at the start of α1, which appear to stabilize the RNA backbone (Ryter and Schultz 1998; Yang et al. 2010).
Consistent with this model, we found that specific historical substitutions in the β1–β2 loop and the α1 region were responsible for observed changes in dsrm–RNA affinities in animals and plants (fig. 3; supplementary figs. S8 and S9, Supplementary Material online). The ancestral animal dsrm lacked the canonical β1–β2 histidine, had a polar but not basic α1 region and bound dsRNA with Kd = 17.17 µM. Along the branch leading to ancAnimaldsrm3, Q31H and ΔSTA52RSKK substitutions occurred, which were collectively sufficient to increase dsRNA affinity 4.3-fold in the ancAnimaldsrm background (P = 0.011), making its RNA affinity indistinguishable from that of ancAnimaldsrm3 (P = 0.46). Independent Q31H and ΔSTA52ΔSKK substitutions along the branch leading to ancAnimaldsrm1 were sufficient to increase dsRNA affinity 3-fold (P = 0.013), which was also statistically indistinguishable from the full ancAnimaldsrm1 sequence (P = 0.11). These results suggest that both the ancestral animal dsrm1 and dsrm3 evolved high dsRNA affinity from a low-affinity ancestor through similar structural mechanisms.

Phylogenetic analysis suggests that the evolution of high-affinity dsrm–RNA interactions in animal DRBs occurred through convergent mechanisms, with the H31 substitution arising independently in ancAnimaldsrm1 and dsrm3 as well as along the dsrm2 lineage (see fig. 3; supplementary fig. S8, Supplementary Material online). Although the alternative hypothesis that H31 arose in the common ancestor of animal dsrms is more parsimonious than three independent substitutions, residues flanking H31 are different in ancestral animal dsrm1 and dsrm3 as well as human TARBP2 dsrm2, suggesting that this region can be highly variable (supplementary fig.
S8, Supplementary Material online). Ancestral residues at this position were reconstructed with high confidence, arguing against reconstruction uncertainty as a major explanation for this result (supplementary fig. S10, Supplementary Material online). Similarly, the KK54 substitution appears to have occurred independently in animal dsrm1, dsrm3 and dsrm2 lineages, with similar variations in flanking residues and very little uncertainty in ancestral sequences (supplementary figs. S8 and S10, Supplementary Material online). Individual animal dsrm1, dsrm2, and dsrm3 clades were strongly supported phylogenetically using a variety of
alignments and inference strategies, arguing against phylogenetic error as the primary explanation for these results (see fig. 2; supplementary figs. S3–S5, Supplementary Material online). Although evolutionary history can never be inferred with absolute certainty, we have not observed any strong evidence for systematic errors in this case.

Although the ancestral plant dsrm had the canonical high-affinity H31 residue (fig. 3; supplementary fig. S8, Supplementary Material online), its STRL53 α2 region was apparently not capable of conferring high dsRNA affinity (Kd = 59.2 µM). Introducing the derived ancPlantdsrm2 α2 region (KNKK53) into the ancestral plant dsrm background was sufficient to increase dsRNA affinity 9.1-fold (P = 0.021), which was similar to the affinity of ancPlantdsrm2 (P = 0.11). Following the evolution of high RNA affinity in ancPlantdsrm2, an H31L substitution along the branch leading to ancPlantDRB6dsrm2 re-evolved low RNA affinity (6.6-fold change in Kd; P = 0.031). Together, these results suggest that concerted amino-acid substitutions in the dsrm β1–β2 loop and α1 region were responsible for repeated gains and losses of dsRNA affinity during the early evolution of animal and plant DRBs.

Although most of the critical residues in ancestral β1–β2 loop and α1 regions were reconstructed with high confidence, some critical residues had lower confidence (<0.9 posterior probability), and in some cases, alternative reconstructions with >0.1 probability were identified (supplementary fig. S10, Supplementary Material online). Most alternative reconstructions were within the same biochemical class as the maximum-likelihood residue, and introducing all alternative key residues into the respective maximum-likelihood sequences did not change experimentally determined RNA affinities (P > 0.22). These results suggest that RNA affinity measurements are likely robust to ancestral sequence ambiguity at key residues (see also fig. 2).
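The two thresholds mentioned above (a maximum-likelihood residue with <0.9 posterior probability, and alternative residues with >0.1 probability) can be applied mechanically to a table of site posteriors. The following Python sketch shows one way to do so; the three-site `example_sites` table and the function name are hypothetical, and the actual per-site values are those in the supplementary material.

```python
def ambiguous_sites(site_posteriors, ml_cutoff=0.9, alt_cutoff=0.1):
    """Return (position, ML residue, plausible alternatives) for ambiguous sites.

    A site is flagged when its ML residue has posterior probability below
    ml_cutoff; alternatives are the non-ML residues above alt_cutoff.
    """
    flagged = []
    for position, site in enumerate(site_posteriors, start=1):
        ml_state = max(site, key=site.get)
        alternatives = sorted(
            state for state, prob in site.items()
            if state != ml_state and prob > alt_cutoff
        )
        if site[ml_state] < ml_cutoff:
            flagged.append((position, ml_state, alternatives))
    return flagged

# Hypothetical posteriors for three sites, for illustration only.
example_sites = [
    {"H": 0.98, "Q": 0.02},             # confidently reconstructed
    {"K": 0.60, "R": 0.35, "N": 0.05},  # one plausible alternative (R)
    {"S": 0.88, "T": 0.06, "A": 0.06},  # low confidence, no single alternative
]

flags = ambiguous_sites(example_sites)
for position, ml_state, alternatives in flags:
    print(position, ml_state, alternatives)
```

Sites flagged with a non-empty list of alternatives are the natural candidates for the kind of alternative-residue experiments described in the text.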
Together, our results suggest that the canonical tandem-dsrm architecture of animal and plant DRB proteins was pieced together independently in early animal and plant lineages from an ancestral dsrm that had relatively low affinity for double-stranded RNA. Following early dsrm-domain duplications, independent but similar substitutions in the β1–β2 loop and α1 region of animal (dsrm1, dsrm3) and plant (dsrm2) dsrms produced domains with higher RNA affinity.

[Figure 3: structural models of dsrm–RNA complexes and bar charts of measured affinity (axis range 3 to 6); image not reproduced. Panel labels:]

Protein                     Position 31   Helix motif
ancAnimaldsrm               Q31           STA52
human TARBP2 dsrm2          H31           TSKK52
ancPlantdsrm                H31           STLR53
ancPlantdsrm2               H31           KNKK53
A. thaliana DRB1 dsrm1      H31           FNRK53
ancPlantDRB6dsrm2           L31           RNKK53
ancAnimaldsrm3              H31           RSKK52
ancAnimaldsrm1              H31           SKK52

[Bar-chart comparisons: ancAnimaldsrm3 vs. ancAnimaldsrm with Q31H, STA52RSKK vs. ancAnimaldsrm; ancAnimaldsrm1 vs. ancAnimaldsrm with Q31H, STA52SKK vs. ancAnimaldsrm; ancPlantdsrm2 vs. ancPlantdsrm with STLR53KNKK vs. ancPlantdsrm; ancPlantDRB6dsrm2 vs. ancPlantdsrm2 with H31L vs. ancPlantdsrm2.]

FIG. 3. Observed shifts in early animal and plant dsrm–RNA affinities are explained by substitutions in the β1–β2 loop and the α2 region. We reconstructed ancestral animal and plant dsrm protein sequences before and after major shifts in dsrm–RNA affinities (see fig. 2) and predicted the dsrm–RNA structural complex by homology modeling and molecular dynamics (see Materials and Methods). Human TARBP2 dsrm2 and A. thaliana DRB1 dsrm1 are shown for comparison. We introduced historical substitutions occurring along the branch spanning each observed functional shift and measured dsrm–RNA affinities using a label-free in vitro kinetics assay (see Materials and Methods). We plot the steady-state dsrm–RNA affinity of each protein (pKd), with longer bars indicating higher affinity. Bars indicate standard errors. Kinetics curves are shown in supplementary figure S9, Supplementary Material online.
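The size of each affinity shift follows directly from the ratio of the experimentally determined Kd values reported above. The short Python sketch below works through that arithmetic for the three gains of affinity; the dictionary layout and the helper name `fold_increase` are illustrative choices, while the Kd values are those quoted in the text.

```python
# Experimentally determined Kd values (in micromolar) quoted in the text.
kd_um = {
    "ancAnimaldsrm": 17.17,
    "ancAnimaldsrm3": 3.2,
    "ancAnimaldsrm1": 4.0,
    "ancPlantdsrm": 59.2,
    "ancPlantdsrm2": 5.2,
}

def fold_increase(kd_ancestor, kd_descendant):
    """Fold increase in affinity along a branch (a lower Kd means tighter binding)."""
    return kd_ancestor / kd_descendant

branches = [
    ("ancAnimaldsrm", "ancAnimaldsrm3"),
    ("ancAnimaldsrm", "ancAnimaldsrm1"),
    ("ancPlantdsrm", "ancPlantdsrm2"),
]

folds = {
    (anc, desc): fold_increase(kd_um[anc], kd_um[desc]) for anc, desc in branches
}
for (anc, desc), fold in folds.items():
    print(f"{anc} -> {desc}: {fold:.1f}-fold")
# ancAnimaldsrm -> ancAnimaldsrm3: 5.4-fold
# ancAnimaldsrm -> ancAnimaldsrm1: 4.3-fold
# ancPlantdsrm -> ancPlantdsrm2: 11.4-fold
```

A ratio below 1 from the same function would indicate a loss of affinity along the branch.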
Although these results demonstrate quantitative changes in an important component of DRB molecular function, the biological consequences of these changes in dsrm–RNA affinity are difficult to determine. Increases in RNA affinity during early animal dsrm evolution were relatively small (4.3- to 5.4-fold), whereas the change in RNA affinity along the plant dsrm2 branch was more substantial (11.4-fold). Animal and plant DRB proteins coordinate key aspects of the RNA interference process, but how changes in dsrm–RNA affinity might impact RNAi is not known. RNAi plays important roles in animal and plant antiviral immunity by directly targeting viral RNA (Lu et al. 2005; Blevins et al. 2006; Zambon et al. 2006; Segers et al. 2007; Qu et al. 2008; Saleh et al. 2009; Umbach and Cullen 2009), suggesting that even small changes in RNA affinity could impact antiviral RNAi targeting and therefore have a potentially strong effect on organism fitness. RNAi also plays important roles in animal and plant development (Grishok et al. 2001; Ketting et al. 2001; Knight and Bass 2001; Bouche et al. 2006; Kloosterman and Plasterk 2006; Liu et al. 2007; Nag and Jack 2010; Sayed and Abdellatif 2011; Duarte et al. 2013); changes in DRB-RNA affinity could therefore impact developmental timing or progression.

Dsrm–RNA Affinity Changed Often in Animal and Plant DRB Lineages

To the best of our knowledge, all existing ancestral reconstruction studies have identified particular nodes on the protein family tree to examine based on phylogenetic patterns and/or limited functional analyses of extant proteins. Although productive, existing studies are limited to examining a small number of nodes on the tree and cannot take a comprehensive, unbiased view of how molecular function may have evolved. As a complementary approach, we reconstructed maximum-likelihood ancestral sequences at every node on the dsrm phylogeny, built structural models of each sequence bound to dsRNA, optimized protein–RNA interactions by molecular dynamics and used statistical machine learning to directly infer affinities from the resulting structural complexes (see Materials and Methods).
Although computational rather than experimental, this approach provides a direct assessment of protein–RNA affinity across the entire evolutionary history of DRB dsrm domains, providing a largely unbiased view of how molecular function may have evolved across a large phylogeny.

We found that dsrm–RNA affinity appears to have changed significantly and often across animal and plant lineages (fig. 4). The smallest pKd estimate was 3.33 (equivalent to Kd = 467.7 μM), and the largest was 6.53 (Kd = 0.295 μM), with an average of 4.79 (Kd = 16.2 μM) and a median of 4.75. Kernel density estimation revealed that the overall distribution of pKd estimates was slightly skewed toward marginally smaller values (mode = 4.65), with a noticeable excess of estimates having pKd > 5.5 (supplementary fig. S11, Supplementary Material online). We built structural models of dsrm–RNA complexes using human TARBP2 and Arabidopsis thaliana DRB1 complexes as templates (see Materials and Methods, supplementary fig. S1, Supplementary Material online). These domains bind RNA in similar conformations (Yang et al. 2010), and pKd estimates using each structural template were highly correlated across ancestral and extant dsrm sequences (supplementary fig. S12, Supplementary Material online). Plotting pKd estimates from each template on the dsrm phylogeny also revealed similar patterns of high- and low-affinity dsrms (supplementary fig. S13, Supplementary Material online).
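The pKd and Kd values quoted above are related by pKd = −log(Kd). As a minimal sketch, assuming a base-10 logarithm and Kd expressed in mol/L (which is consistent with the quoted pairs), the conversion can be written as follows. The function names are illustrative and are not part of the authors' software.

```python
import math


def pkd_to_kd_micromolar(pkd: float) -> float:
    """Convert pKd = -log10(Kd in mol/L) to Kd in micromolar."""
    return 10.0 ** (-pkd) * 1e6


def kd_micromolar_to_pkd(kd_um: float) -> float:
    """Convert Kd in micromolar to pKd = -log10(Kd in mol/L)."""
    return -math.log10(kd_um * 1e-6)


def fold_change(pkd_before: float, pkd_after: float) -> float:
    """Fold change in affinity; values above 1 mean affinity increased."""
    return 10.0 ** (pkd_after - pkd_before)


if __name__ == "__main__":
    # Smallest, largest and average pKd estimates quoted in the text.
    for pkd in (3.33, 6.53, 4.79):
        print(f"pKd {pkd:.2f} -> Kd {pkd_to_kd_micromolar(pkd):.3g} uM")
```

Because pKd is logarithmic, a difference of one pKd unit corresponds to a 10-fold difference in Kd, so modest-looking pKd differences between clades can still represent several-fold changes in affinity.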
Dsrm–RNA affinity prediction used structural information about the dsrm–RNA complex, which we inferred by homology modeling and molecular dynamics (see Materials and Methods). Any errors in ancestral sequence reconstruction that impact protein folding or stability could therefore impact pKd prediction. Previous studies have found that ASR errors are associated with high levels of ambiguity in the reconstructed sequence (Hanson-Smith et al. 2010). If pKd predictions were strongly affected by error or ambiguity in the ancestral sequence, we would therefore expect a strong correlation between ancestral sequence ambiguity and pKd estimates. We found no correlation between pKd estimates and the average posterior probability of ancestral states across the phylogeny (Pearson and Spearman correlations < 0.02; P > 0.98), suggesting that, overall, ancestral sequence ambiguity did not have a strong effect on pKd prediction.

When pKd estimates using combined structural templates were plotted on the dsrm phylogeny (fig. 4; supplementary fig. S14, Supplementary Material online), we observed a large number of changes in dsrm–RNA affinity across the tree, with only a few major clades exhibiting stable affinity estimates. The most obvious such grouping was animal dsrm3, which appears to have evolved high affinity for RNA early in its evolutionary history (predicted pKd = 5.87 for the ancestral dsrm3 vs. 4.33 for the ancestral animal dsrm; P = 9.39e−5) and maintained high affinity across all extant and ancestral dsrm3s (mean pKd = 5.53, SE = 0.025). Animal dsrm1 also appears to have evolved a relatively stable and high affinity for dsRNA (mean pKd = 5.03, SE = 0.035), except in the mammalian TARBP2 lineage, which lost affinity for RNA, according to our analysis (mean pKd = 4.00, SE = 0.060). Animal dsrm2's RNA affinity appeared generally lower than dsrm1 and 3 (mean pKd = 4.56, SE = 0.021).
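The ambiguity check described earlier in this section correlates each node's pKd estimate with the average posterior probability of its reconstructed states. The following is a minimal sketch of Pearson and Spearman correlation coefficients in plain Python. The input lists are hypothetical placeholders, not data from this study, and the sketch computes coefficients only (no P values).

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical per-node values for illustration only.
    mean_posterior = [0.99, 0.95, 0.90, 0.85, 0.97]
    predicted_pkd = [4.6, 4.9, 4.5, 4.8, 4.7]
    print(pearson(mean_posterior, predicted_pkd))
    print(spearman(mean_posterior, predicted_pkd))
```

Coefficients near zero from both measures, as reported above, would indicate that neither a linear nor a monotonic relationship links reconstruction confidence to predicted affinity.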
Overall, plant pKd predictions were slightly lower than those of animal dsrms (plant mean pKd = 4.59, SE = 0.014; animal mean pKd = 4.75, SE = 0.018), and we observed fewer large clades with consistently high or low RNA affinities in the plant lineage (supplementary fig. S14, Supplementary Material online). Overall, plant dsrm1 and dsrm2 sequences had similar predicted affinities (dsrm1 mean pKd = 4.55, SE = 0.019; dsrm2 mean pKd = 4.63, SE = 0.022). Within plant dsrm1 groups, DRB1 had the highest affinity for RNA (mean pKd = 4.72, SE = 0.039), and DRB4 had the lowest (mean pKd = 4.39, SE = 0.041), but there was only a 2.2-fold variation in average RNA affinities across the major dsrm1 clades (supplementary fig. S14, Supplementary Material online). The major plant dsrm2 clades exhibited a slightly higher variation in RNA affinities (2.9-fold). Similar to results from dsrm1 clades, the dsrm2 domain from DRB1 had the highest affinity for RNA (mean pKd = 4.82, SE = 0.039), and DRB4 dsrm2 had the lowest average affinity across the entire clade
(mean pKd = 4.36, SE = 0.044). There were some smaller plant clades with consistently high RNA affinities (fig. 4). For example, the second dsrm domain of Solanaceae DRB6 had mean pKd = 4.93 (SE = 0.195). Aside from Brassicaceae and Rosaceae, the second dsrm domain of eudicot DRB1 also had relatively high affinity for dsRNA (mean pKd = 4.92, SE = 0.037).

[Figure 4 labels: plant dsrm1 and dsrm2 clades (DRB1, DRB2/3/5, DRB4, DRB6); animal dsrm1, dsrm2 and dsrm3 clades (LOQS, TARBP2, PRKRA, R2D2); staufen dsrms; diploblast dsrms; other dsrms. Scale bar: 0.4 substitutions/site. Branch color scale: pKd, 3.0–7.0. Inset bar plots (pKd axis, 3–6): predicted pKds for ML and sampled sequences, and empirical pKd for the ML sequence; significance marked as p<0.05 or n.s.]

FIG. 4. Dsrm–RNA affinities changed often across animal and plant lineages. We inferred the maximum-likelihood phylogeny of dsrm protein sequences using a structure-based alignment (see Materials and Methods). Branch lengths are scaled to substitutions/site, and clades with <0.8 SH-like aLRT are collapsed. Ancestral dsrm sequences were reconstructed at each node on the tree, and dsrm–RNA structural complexes were inferred by homology modeling and molecular dynamics (see Materials and Methods). Dsrm–RNA affinities were predicted by statistical machine learning (see Materials and Methods). We color branches by the average dsrm–RNA binding affinity (pKd) across multiple replicate models of each ancestral and extant sequence on the phylogeny, with red indicating high-affinity and blue indicating low-affinity. Triangles indicate branches on which there was a significant change in predicted pKd, as indicated by FDR-corrected independent t test. Boxes plot the predicted affinity of the maximum-likelihood ancestral sequence (dark gray), random samples drawn from the ancestral state probability distribution (medium gray) and the experimentally determined affinity (light gray) before (bottom) and after (top) the observed shift. Bars indicate standard errors, and results that were not significant (n.s.) using either sampled sequences or empirical affinity measurements are indicated.

In order to characterize the rate at which dsrm–RNA affinity evolved across the phylogeny, we treated affinity similar to a quantitative phenotypic trait, applying a Brownian motion model to infer changes in the rate of affinity evolution across extant and ancestral dsrm domains (Eastman et al. 2011). In general, we expect changes in dsrm–RNA affinity to be roughly correlated with changes in dsrm protein sequence, with significant shifts in the coefficient of proportionality indicating acceleration or deceleration of affinity change, relative to sequence change. We inferred shifts in the coefficient of proportionality using a Bayesian "breakpoint" model across the dsrm phylogeny (see Materials and Methods).

We found that, with the exception of early branching dsrm1 sequences from plant DRB6 and DRB2/3/5, plant dsrms had a higher coefficient of proportionality than animal dsrms (fig. 5; supplementary fig. S15, Supplementary Material online), suggesting that changes in dsrm–RNA affinity occurred more often in plants than in animals, relative to dsrm sequence change. Although the inference of strongly supported discrete shifts in the coefficient of proportionality is a known limitation of this type of evolutionary model (Eastman et al. 2011), we did identify a number of discrete increases in the rate of dsrm–RNA affinity change early in the plant lineage (posterior probability > 0.35), as well as a spattering of more weakly supported possible changes in more terminal plant lineages (fig. 5). In animal dsrms, we found a strongly supported discrete shift in the rate of dsrm–RNA affinity change in the diploblast lineage (posterior probability = 0.92), and another strongly supported shift in the vertebrate TARBP2/PRKRA dsrm2 lineage (posterior probability = 0.94; fig. 5). We also observed a number of more weakly supported increases in the rate of dsrm–RNA affinity change across the animal phylogeny (fig. 5). Overall, we observed more support for discrete increases in the rate of dsrm–RNA affinity evolution than decreases. Results were similar when we inferred changes in the rate of dsrm–RNA affinity evolution using the same Brownian-motion model but without considering affinity estimates from ancestral reconstructed sequences, although the absolute rates tended to be marginally lower (supplementary fig. S15, Supplementary Material online).

As a whole, these results suggest that animal and plant dsrm sequences likely evolved under different dynamics. Animal dsrms appear to have differentiated into low- and high-affinity RNA receptors earlier, and affinity was more consistently maintained across larger taxonomic groupings, with an overall reduced rate of affinity change (figs. 4 and 5; supplementary figs. S14 and S15, Supplementary Material online). In contrast, the RNA affinities of plant dsrms appear more evolutionarily labile, with fewer large clades exhibiting high RNA affinity and potentially more variable affinities across major clades.

Prediction of dsrm–RNA affinities across a large phylogeny of ancestral and extant proteins presents an opportunity to directly identify significant shifts in RNA affinities by comparing the pKd prediction of each ancestral protein to that of its immediate descendent, thereby identifying particular branches on which dsrm–RNA affinity has changed (see Materials and Methods). This approach may not detect slow changes in dsrm–RNA affinities that occur across multiple branches, and it is unlikely that this approach will have equal power on all branches of the phylogeny. Nonetheless, this simple approach does provide a means for identifying strong, abrupt changes in protein-ligand affinities not linked to specific topological events, such as gene- or domain duplications.
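Because the branch-by-branch scan tests every ancestor-descendant pair on the tree, the per-branch P values need a multiple-testing correction; Materials and Methods cite the Benjamini and Hochberg (1995) FDR procedure for this. The following is a minimal sketch of that correction in plain Python. It assumes the per-branch P values have already been computed by the unequal-variance t tests; the branch names and P values in the usage example are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_branches(branch_pvalues, alpha=0.05):
    """Names of branches whose adjusted P value falls below alpha."""
    names = list(branch_pvalues)
    adjusted = benjamini_hochberg([branch_pvalues[name] for name in names])
    return [name for name, q in zip(names, adjusted) if q < alpha]


if __name__ == "__main__":
    # Hypothetical raw P values for four branches.
    raw = {"branch_a": 0.01, "branch_b": 0.04, "branch_c": 0.03, "branch_d": 0.20}
    print(significant_branches(raw))
```

In this toy example only one branch survives correction even though three raw P values are below 0.05, which illustrates why a scan across hundreds of branches reports far fewer shifts than uncorrected tests would.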
After correcting for multiple tests, we identified 13 branches across the dsrm phylogeny exhibiting significant support for a shift in RNA affinity, using maximum-likelihood ancestral sequence reconstruction (P < 0.05; fig. 4). Many of these observed shifts in predicted dsrm–RNA affinities were not robust to ancestral sequence ambiguity, particularly in the plant lineage (fig. 4). When we reconstructed multiple replicate ancestral sequences from the posterior distribution (see Materials and Methods), only 3/8 of the inferred shifts in plant dsrm–RNA affinity remained statistically significant, whereas 4/6 shifts observed in the animal lineage were robust to ancestral sequence uncertainty (fig. 4). All but one of the dsrm–RNA affinity shifts that were robust to ancestral sequence ambiguity could be experimentally verified (fig. 4).

[Figure 5 labels: Plants (dsrm1 and dsrm2 clades of DRB1, DRB2/3/5, DRB4 and DRB6); Animals (dsrm1, dsrm2, dsrm3, staufen dsrms). Branch color scale: posterior rates, 0.005–11.4. Node circles: rate-shift direction, −1.0 to 1.0; rate-shift probability, 0.125–1.000.]

FIG. 5. The rate of dsrm–RNA affinity evolution is higher in plants than in animals and exhibits a number of discrete shifts across the dsrm phylogeny. We inferred the evolution of the rate at which dsrm–RNA affinity changes using a Brownian motion "breakpoint" model of affinity evolution fit to predicted dsrm–RNA affinities across extant and ancestral-reconstructed sequences (see Materials and Methods). Branches are scaled to the inferred number of protein substitutions/site and colored by the posterior rate multiplier, averaged over four independent MCMC runs. Red branches indicate faster evolution of dsrm–RNA affinity, with blue branches indicating slower evolution of affinity. Circles on nodes indicate inferred increases (red) or decreases (blue) in the rate multiplier, with the size of the circle indicating the posterior probability of a discrete shift at the specified node. Outgroup branches have been removed. Major taxonomic and gene family lineages are indicated.

Sampling ancestral states from the posterior distribution has been suggested as one approach to alleviate potential state frequency biases in maximum-likelihood ancestral reconstruction (Pollock and Chang 2007). However, the incorporation of low-probability ancestral residues is also expected to introduce a larger number of possible errors, which can collectively degrade protein function (Hobbs et al. 2012). We found that pKd estimates obtained from sampled ancestral sequences were almost always the same as or less than estimates using maximum-likelihood ancestral sequences, consistent with a larger number of potential errors introduced by sampling (figs. 2 and 4). Some of the significant shifts in dsrm–RNA preference identified using the maximum-likelihood sequences could in fact be real, even if they failed to be confirmed by posterior sampling (fig. 4). However, here we consider only those shifts found to be robust to ancestral sequence ambiguity.

One of the inferred shifts in animal dsrm–RNA affinity, the shift to high affinity along the branch leading to the dsrm3 lineage, was observed in our earlier analysis (fig. 2) and was found to have occurred via a Q31H substitution in the β1–β2 loop and the introduction of a number of basic residues in the α2 region (fig. 3). Of the remaining three shifts in the animal lineage, one occurred in the Staufen dsrms, and two occurred in early mammals: one a 10.0-fold loss of RNA affinity in mammalian TARBP2 dsrm1 (based on experimentally determined affinities, P < 0.012), and the other a 3.79-fold increase in PRKRA dsrm2's affinity for RNA (P < 0.036).
The loss of RNA affinity in mammalian TARBP2 dsrm1 occurred at the base of the Boreoeutherian lineage. We hypothesized that the insertion of a pair of residues upstream of the RNA-contacting H31 was primarily responsible for the observed loss of RNA affinity by repositioning H31 out of favorable RNA contact (ΔΔ29QV insertion; fig. 6A). Indeed, introducing this insertion into the ancestral TARBP2 dsrm1 background reduced RNA affinity nearly 10-fold, which was indistinguishable from that of the derived Boreoeutherian TARBP2 dsrm1 (P > 0.43; fig. 6A; supplementary fig. S16A, Supplementary Material online). This insertion was strongly supported by ancestral sequence reconstruction (supplementary table S5, Supplementary Material online). The ancestral ΔΔ29 states were reconstructed with posterior probability > 0.999, as were the derived QV29 residues.

The second major change in animal dsrm–RNA affinity occurred in the Eutherian mammal PRKRA dsrm2, after the Eutherian mammals diverged from marsupials. In this case, both ancestral and derived PRKRA dsrm2 domains had the canonical H31 RNA-contact residue, although the ancestral mammal PRKRA dsrm2 bound RNA with relatively low affinity (fig. 6B). We hypothesized that a single K33R substitution in the dsrm2 β1–β2 loop was responsible for increasing RNA affinity by introducing favorable polar contacts (fig. 6B). The ancestral K33 residue was disengaged from the RNA ligand in the structural model, whereas the derived R33 could extend into the RNA's minor groove to form hydrogen bonds with the RNA base. Consistent with this hypothesis, introducing the K33R substitution into the ancestral mammal dsrm2 background was sufficient to increase dsrm–RNA affinity to that of the derived Eutherian dsrm2 (P > 0.22; fig. 6B; supplementary fig. S16B, Supplementary Material online). The ancestral K33 residue was reconstructed with posterior probability 0.998, and the derived R33 was reconstructed with posterior probability 1.0, suggesting that ancestral reconstruction ambiguity did not affect this result (supplementary table S5, Supplementary Material online).

[Figure 6 panels: (A) ancPreBoreoeutherian TARBP2 dsrm1a (H31) and ancBoreoeutherian TARBP2 dsrm1b (Q29, V30, H31); bars for dsrm1b, 29QV, dsrm1a. (B) ancPreEutherian PRKRA dsrm2a (K33, F35) and ancEutherian PRKRA dsrm2b (R33, Y35); bars for dsrm2b, K33R, dsrm2a. (C) ancPreFabaceae DRB4 dsrm1a (P30, A32) and ancFabaceae DRB4 dsrm1b (P30, H31, A32); bars for dsrm1b, 31H, dsrm1a. (D) ancPreRosid DRB1 dsrm1a (P30, H31, V32) and ancRosid DRB1 dsrm1b (S30, H31, E32); bars for dsrm1b, P30S,V32E, dsrm1a. Bar-plot axis: pKd, 3–6.]

FIG. 6. Observed shifts in animal and plant dsrm–RNA affinities are explained by substitutions in the β1–β2 loop and the α2 region. We reconstructed ancestral animal and plant dsrm protein sequences before (bottom) and after (top) major shifts in dsrm–RNA affinities (see fig. 4) and predicted the dsrm–RNA structural complex by homology modeling and molecular dynamics (see Materials and Methods). We introduced historical substitutions occurring along the branch spanning each functional shift and measured dsrm–RNA affinities using an in vitro kinetics assay (see Materials and Methods). We plot the steady-state dsrm–RNA affinity of each protein (pKd), with longer bars indicating higher affinity. Bars indicate standard errors. Kinetics curves are shown in supplementary figure S16, Supplementary Material online.

We found that similar changes in the β1–β2 loop were responsible for the two observed increases in plant dsrm–RNA affinities (figs. 4 and 6C, D). Both these RNA affinity shifts occurred in plant dsrm1 lineages, one in Fabaceae DRB4 (fig. 6C) and the other in Rosid DRB1 (fig. 6D). The ancestral plant DRB4 dsrm1 lacked the canonical H31 RNA-contact residue (reconstructed as Δ31 with posterior probability 0.98; see supplementary table S5, Supplementary Material online) and bound dsRNA with relatively low affinity (experimentally determined pKd = 4.31). Introduction of the H31 substitution into this background increased affinity 8.1-fold, which was marginally higher than the derived Fabaceae DRB4 dsrm1 (P < 0.046; fig. 6C; supplementary fig. S16C, Supplementary Material online). Finally, the ancestral Rosid DRB1 dsrm1 increased RNA affinity 3-fold after Rosids diverged from other plant lineages (from pKd = 4.41 to pKd = 4.89; fig. 6D). This occurred through a pair of substitutions flanking the H31 contact residue, a P30S substitution that introduced favorable dsrm–RNA polar contacts and a V32E substitution (fig. 6D). Introducing these substitutions into the ancestral plant DRB1 dsrm1 recapitulated the observed shift in dsrm–RNA affinity along the Rosid lineage (P > 0.37; fig. 6D; supplementary fig. S16D, Supplementary Material online). As in the animal shifts, all key residues affecting these shifts in plant dsrm–RNA affinity were reconstructed with high confidence, suggesting ancestral sequence ambiguity did not affect these results (supplementary table
S5, Supplementary Material online).

Together, these results suggest that convergent evolutionary changes in the β1–β2 region of animal and plant dsrms were responsible for increases and decreases in dsrm–RNA affinities across various animal and plant lineages (figs. 3 and 6; supplementary figs. S9 and S16, Supplementary Material online). These independent changes altered dsrm–RNA affinities through similar structural mechanisms: either by establishing/interfering with a critical H31-RNA contact or by altering dsrm–RNA polar contacts within the β1–β2 loop or α2 region. These findings strongly suggest that the β1–β2 loop is a "hotspot" for "tinkering" with dsrm–RNA affinities across a very broad evolutionary timespan.

We note that not all changes in dsrm–RNA affinities were identified by our phylogeny-wide scan; some of the changes identified during our study of early dsrm diversification were not found (figs. 2 and 4). This suggests that the phylogeny-wide scan approach is not a direct replacement for other methods used to identify potential shifts in ancestral molecular function but could be complementary, potentially identifying changes in molecular function not readily predicted by other means. We also note that there are some differences between computationally predicted and experimentally determined pKd estimates (figs. 2 and 4); this is expected, given that the statistical prediction algorithm was trained across a wide variety of protein–RNA and protein–DNA complexes (Dias and Kolazckowski 2015), and the RNA crystallized with TARBP2 and DCL1 templates is short and may not engage the entire potential RNA-binding region (Ryter and Schultz 1998; Yang et al. 2010). Particularities of the experimental conditions can also have a large effect on affinity measurements (Svec et al. 1980; Reverberi and Reverberi 2007). Nonetheless, the patterns of changes in affinity were generally consistent between computational and experimental approaches, suggesting that computational prediction of protein–RNA affinities is a potentially useful strategy for examining broad-scale changes in molecular function
across the evolutionary histories of RNA-binding proteins.

Conclusions

The continued explosion of "big data" in biology has generated particular challenges that cut across fields, one of which is how best to sort through large, complex datasets to identify specific hypotheses that can be rigorously tested experimentally. Ancestral sequence resurrection studies have historically relied on an ad-hoc assortment of heuristics to identify particular ancestral nodes for functional analysis, including examining gene duplication patterns or patterns of branch lengths, characterizing changes in selection and projecting functional diversity of extant proteins "back in time" along the phylogeny (Malcolm et al. 1990; Shih et al. 1993; Ugalde et al. 2004; Bridgham et al. 2006; Bridgham et al. 2009; Zmasek and Godzik 2011; Voordeckers et al. 2012; van Hazel et al. 2013; Ogawa and Shirai 2014; Whitfield et al. 2015; Clifton and Jackson 2016). While these approaches are useful, they are indirect assessments of the hypothesis under examination, which is when and how molecular function has changed across a protein family's phylogeny.

Here we have presented a statistical approach for directly examining changes in molecular function across large phylogenies computationally. We have applied this technique to study the evolution of ligand affinity in a family of animal and plant double-stranded RNA binding proteins contributing to RNA interference and demonstrated its capacity to identify shifts in molecular function that were then confirmed experimentally. The scalability of this approach allows researchers to directly examine the effects of ancestral sequence ambiguity and other sources of uncertainty on functional inferences, which is difficult to achieve using low-throughput experiments. We expect that similar computational approaches will help inform future ancestral sequence resurrection studies, ultimately providing a direct and unbiased view of how protein families evolve functional diversity.

Our results demonstrate how individual dsrm functional domains within animal and plant DRB proteins have gained
and lost affinity for dsRNA through evolutionary tinkering at two primary dsrm–RNA structural interfaces. However, the implications of these changes in dsrm–RNA affinity for DRB function or for the functioning of the RNA interference systems they participate in remain unclear. In addition to binding RNA, DRB dsrms have been shown to directly mediate interactions with Dicers in animals and plants (Kurihara et al. 2006; Wilson et al. 2015), but the extent to which dsrm–RNA and dsrm–protein binding may involve evolutionary "trade-offs" in specialization is not clear. In humans, DRBs appear to interact directly with a short protein-binding domain within the Dicer Helicase (Wilson et al. 2015), potentially altering the structural dynamics and catalytic efficiency of the DRB–Dicer–RNA system, particularly under conditions of high RNA concentrations (Taylor et al. 2013; Fareh et al. 2016). While it is conceivable that changes in dsrm–RNA affinity could impact the functional dynamics of the DRB–Dicer–RNA system, this has not been examined. DRBs have also been shown to help determine specificity of RNA interference pathways in arthropods, although the structural mechanisms are not known (Liu et al. 2006; Zhou et al. 2009; Marques et al. 2010; Hartig and Forstemann 2011). Plant Dicers (aka, "Dicer-like" or "DCL") lack the protein-binding domain facilitating DRB–Dicer interactions in animals, and appear to interact via dsrm–dsrm contacts (Kurihara et al. 2006), although the structural interface has not been determined. The potential does appear to exist for evolution of DRB function to impact RNA interference through possible effects on Dicer processing of RNA targets.
However, further examination of DRB–Dicer–RNA interactions within an explicit evolutionary framework will be required to begin linking specific changes in DRB sequence to potential changes in RNAi processing.

Materials and Methods

DRB Sequence Identification, Alignment, and Phylogenetic Analysis

Protein sequences containing at least one double-stranded RNA-binding motif (dsrm, NCBI conserved domain database id CD00048) were identified by rpsblast search of the NR database using an e-value cutoff of 0.01 (Marchler-Bauer and Bryant 2004; Marchler-Bauer et al. 2015; Coordinators 2016). Double-stranded RNA-binding proteins (DRBs) were identified as full-length protein sequences containing 2–3 dsrms and no other annotated functional domains with e-value < 0.01.

Full-length DRB protein sequences were aligned using Clustal Omega v1.2.3 (Sievers et al. 2011), MUSCLE v3.8.31 (Edgar 2004), mafft-einsi v7.215 (Katoh and Standley 2013), and MSAProbs v0.9.7 (Liu et al. 2010) with default parameters. Alignments of only annotated functional domains with intervening sequence removed were also produced using the same methods. Alignments were left unprocessed or processed by Gblocks v0.91 to remove potentially ambiguous regions (Talavera and Castresana 2007). We set the minimum number of sequences for a flank position (-b2) equal to 3/5 the total number of sequences in the alignment. The maximum number of contiguous nonconserved positions (-b3) was set to 10. The minimum block length (-b4) was 5, and gap positions were allowed (-b5=a). Other Gblocks parameters were left at default values.
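The Gblocks settings above depend on the number of sequences in each alignment, so the -b2 value has to be computed per alignment. The following is a minimal sketch that assembles the corresponding argument list. The rounding rule (rounding 3/5 of the sequence count up to a whole number), the `-t=p` protein flag, the executable name and the file name are assumptions made for illustration; the text does not state them.

```python
import math


def gblocks_args(alignment_path: str, n_sequences: int) -> list:
    """Build a Gblocks argument list matching the settings in the text.

    Assumptions (not stated in the text): 3/5 of the sequence count is
    rounded up, and the alignment is flagged as protein with -t=p.
    """
    b2 = math.ceil(3 * n_sequences / 5)  # min. sequences for a flank position
    return [
        "Gblocks",
        alignment_path,
        "-t=p",       # protein alignment (assumed)
        f"-b2={b2}",
        "-b3=10",     # max. contiguous nonconserved positions
        "-b4=5",      # min. block length
        "-b5=a",      # gap positions allowed
    ]


if __name__ == "__main__":
    # Hypothetical alignment file with 100 sequences.
    print(" ".join(gblocks_args("drb_alignment.fasta", 100)))
```

The resulting list can be passed to a process launcher such as `subprocess.run`; it is left as a plain list here so the sketch runs without Gblocks installed.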
Initial maximum likelihood phylogenies were constructed from each alignment using FastTree v2.1.7 with default parameters (Price et al. 2010). Initial trees were used as starting trees for full maximum-likelihood reconstruction using RAxML v8.0.24 (Stamatakis 2014), with the best-fit evolutionary model selected from each alignment using AIC in ProtTest v3 (Darriba et al. 2011). Clade support was evaluated by SH-like aLRT scores (Anisimova and Gascuel 2006). Maximum-likelihood phylogenies produced from each alignment were converted to a clade presence–absence matrix using the SuperTree Toolkit v0.1.2 (Hill and Davis 2014), and a supertree was inferred from this matrix using the BINCAT model in RAxML (Nguyen et al. 2012). We also concatenated all individual alignments into a single supermatrix and reconstructed the maximum-likelihood protein family phylogeny using RAxML, with the best-fit evolutionary model selected by AIC (Wheeler et al. 1995). We present a consensus of "supertree" and "supermatrix" results.

Dsrm Functional Domain Identification, Structural Modeling, and RNA Affinity Prediction

We identified all dsrm functional domains from the RefSeq database (Pruitt et al. 2007) using the approach described in the previous section. Dsrm protein sequences were clustered using MCL v14-137 (Enright et al. 2002). We calculated all-vs.-all blast distances among identified dsrms with an e-value cutoff of 0.1. E-values were log10-transformed and capped to 200. Node degrees were capped at 280, which was the smallest maximum node degree that maintained a fully connected graph. MCL clustering was performed at various inflation parameters (1.01, 1.05, 1.1, 1.15, 1.2, 1.4, 1.6, 1.8, 2.0, and 3.0) after pre-inflating the graph (-pi 3) to improve contrast between high and low edge weights. Annotated DRBs from H. sapiens, D. melanogaster, and A. thaliana genomes were mapped to clusters, and we selected the optimal MCL clustering as that which maximized the number of annotated DRBs per cluster. All sequences within any cluster containing at least one annotated DRB were considered potential closely related DRB homologs.
Dsrm sequences closely related to those from DRBs were also identified phylogenetically. All dsrm protein sequences were aligned using the methods described above, and maximum-likelihood phylogenies were inferred from each dsrm alignment. Any dsrm sequences grouping with annotated DRBs from H. sapiens, D. melanogaster, and A. thaliana with SH-like aLRT > 0.9 were considered closely related, and we combined closely related dsrms from Markov clustering and phylogenetic analysis.

We identified experimentally determined dsrm structures by sequence search of the RCSB protein data bank (Rose et al. 2013), using dsrms from annotated human, D. melanogaster, and A. thaliana DRBs as queries and an e-value cutoff of 0.01. Resulting X-ray and NMR structures were aligned using the cealign algorithm in Pymol v1.8.1. We used the mafft --add
parameter to align dsrm protein sequences to the structure-based alignment. We inferred the maximum-likelihood dsrm domain tree from the structure-based alignment, collapsed nodes with <0.8 SH-like aLRT support and reconstructed ancestral dsrm sequences at each node on the phylogeny by maximum-likelihood (Yang et al. 1995). We additionally sampled 20 ancestral dsrm sequences at each node from the posterior distribution of residues reconstructed at each site (Pollock and Chang 2007).

For each ancestral and extant dsrm protein sequence, we used MODELLER v9.14 (Eswar et al. 2008) to infer structural models of the dsrm bound to double-stranded RNA, using human TARBP2 (PDB ID: 3ADL) and A. thaliana DRB1 (PDB ID: 3ADI) as templates (Yang et al. 2010). Using each template, we constructed 100 potential structural models and selected the best one using the modeler objective function (molpdf), DOPE and DOPEHR scores (Shen and Sali 2006). Each score was re-scaled to units of standard-deviation across the 100 models, and we selected the best model as that with the best average of re-scaled molpdf, DOPE and DOPEHR scores.

Each initial dsrm–RNA structural model was used as a starting point for a short molecular dynamics simulation using GROMACS v4.6.5 (Pronk et al. 2013). We used the amber99sb-ildn force field and the tip3p water model. Initial dynamics topologies were generated using the GROMACS pdb2gmx algorithm with default parameters. Topologies were relaxed into simulated solvent at pH 7 using a 50,000-step steepest-descent energy minimization.
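The model-selection rule described above (re-scaling molpdf, DOPE and DOPEHR to standard-deviation units across candidate models and averaging them) can be sketched as follows. This is an illustrative sketch, not the authors' script: it assumes that lower values are better for all three scores, uses the population standard deviation, and takes scores from a hypothetical list of dictionaries rather than from MODELLER output files.

```python
import statistics


def zscores(values):
    """Re-scale values to standard-deviation units (population SD)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mean) / sd for v in values]


def best_model(models):
    """Pick the model with the lowest mean z-score across the three scores.

    `models` is a list of dicts with keys 'name', 'molpdf', 'dope' and
    'dopehr'. Lower is assumed to be better for every score.
    """
    keys = ("molpdf", "dope", "dopehr")
    z = {k: zscores([m[k] for m in models]) for k in keys}
    combined = [
        sum(z[k][i] for k in keys) / len(keys) for i in range(len(models))
    ]
    best_index = min(range(len(models)), key=lambda i: combined[i])
    return models[best_index]["name"], combined


if __name__ == "__main__":
    # Hypothetical scores for three candidate models.
    candidates = [
        {"name": "a", "molpdf": 100.0, "dope": -1000.0, "dopehr": -900.0},
        {"name": "b", "molpdf": 90.0, "dope": -1100.0, "dopehr": -1000.0},
        {"name": "c", "molpdf": 110.0, "dope": -900.0, "dopehr": -800.0},
    ]
    print(best_model(candidates)[0])
```

Standardizing before averaging keeps any one score from dominating simply because it is reported on a larger numerical scale.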
The system was then brought to 300 K using a 50-ps dynamics simulation under positional restraints, followed by pressure stabilization for an additional 50 ps. Simulations were run using Particle-Mesh Ewald electrostatics with cubic interpolation and grid spacing of 0.12 nm. Van der Waals forces were calculated using a cutoff of 1.0 nm. We used Nosé–Hoover temperature coupling, with protein, RNA and solvent systems coupled separately and the period of temperature fluctuations set to 0.1 ps. Pressure coupling was applied using the Parrinello–Rahman approach, with a fluctuation period of 2.0 ps. Nonbonded cutoffs were treated using buffered Verlet lists. We selected five complexes from the last 20 ps of each pressure stabilization simulation for affinity prediction.

Dsrm–RNA affinities were predicted from structural complexes using a statistical machine learning approach (Dias and Kolazckowski 2015). Simulated solvent and ions were excluded from the protein–RNA complex, the binding site was identified, and protein–RNA interactions were decomposed into a vector of atom–atom interaction features likely to correlate with binding affinity, as described in (Dias and Kolazckowski 2015). Affinities [reported as pKd = −log(Kd)] were predicted using a support vector regression model previously trained using a large number of protein–RNA and protein–DNA complexes with associated experimental affinity measurements. We report the mean of predicted affinities across the five complexes sampled from each dsrm structural model. Differences in predicted pKds were assessed using a two-tailed unpaired t test, assuming unequal variances and correcting for multiple tests using an FDR correction (Benjamini and Hochberg 1995). We characterized the impact of ancestral sequence ambiguity on predicted protein–RNA affinities by calculating Pearson and Spearman correlations between pKd estimates and the average posterior probability of ancestral states at each node. Significance was evaluated using the Student's t-test.

Brownian Motion Modeling of dsrm–RNA Affinity Evolution

We modeled the evolution of dsrm–RNA affinity using a Brownian motion process (Felsenstein 1973; Eastman et al.
2011), in which we allowed the rate of affinity evolution to be proportional to the number of substitutions/site along each branch of the phylogeny. The coefficient of proportionality was treated as a free model parameter, and we inferred changes in this parameter's value using reversible-jump Markov chain Monte Carlo (Eastman et al. 2011). Proposed changes in the coefficient of rate proportionality (i.e., "rate shifts") were assumed to be inherited by descendent nodes on the phylogeny, unless subsequent rate shifts were also present in a descendent subtree. Four independent MCMC runs were performed using the full model of Brownian motion including jumps with relaxed rates (type = jump-rbm) for 100,000 generations, sampled every 100 generations, and the first 25% of samples were discarded as burn-in. We confirmed that the average standard deviation in rate shift posterior probabilities was <0.01 across independent runs, suggesting that MCMC chains had converged to the stationary distribution (Ronquist et al. 2012). We report posterior probabilities combined from all four independent runs. MCMC analyses were conducted using either extant + ancestral affinity predictions (pKds, see above) or only using affinity predictions from extant sequences. Standard errors in affinity predictions were included in all Brownian motion models.

Experimental Measurement of dsrm–RNA Affinity

We generated blunt-ended GC-rich 28-bp RNA molecules in vitro using T7 RNA reverse transcriptase and synthetic dsDNA as template. Complementary purified single-stranded RNAs were annealed to produce double-stranded RNA by combining at 1:1 ratio, heating to 95 °C for 5 min and then cooling to 25 °C. Blunt-ended dsRNA was produced by exposure to alkaline phosphatase. The 3′ end of one RNA strand was biotinylated to facilitate kinetics assays using the Pierce™ 3′ End RNA Biotinylation Kit (Thermo).

Ancestral and extant dsrms were expressed in E. coli Rosetta™ 2(DE3)pLysS cells using pET-22b(+) constructs, which were verified by Sanger sequencing. Proteins were purified by His-affinity purification and visualized by SDS-page stained with 1% coomassie. Protein concentrations were measured using a linear-transformed Bradford assay (Zor and Selinger 1996).
We measured dsrm–RNA binding using a label-free in vitro kinetics assay at pH 7 (Abdiche et al. 2008; Frenzel and Willbold 2014). Biotinylated RNA molecules were bound to a series of eight streptavidin probes for 5 min, until saturation was observed. Probes were washed and then exposed to
25 μg/ml biocytin to bind any remaining free streptavidin. Each probe was then exposed to dsrms at increasing concentrations in 1× Kinetics Buffer (ForteBio) for 6 min, followed by dissociation in Kinetics Buffer for an additional 4 min before exposure to the next concentration of dsrm protein (Frenzel and Willbold 2014). Molecular binding at each concentration over time was measured as the change in laser wavelength when reflected through the probe in solution, sampled every 3 ms. Two probes were not exposed to dsrm protein as controls to evaluate system fluctuation across the time of the experiment; measurements from these control probes were averaged and subtracted from each analysis probe.

For each replicate experiment, we estimated the dsrm concentration at which 1/2-maximal steady-state RNA binding was achieved (Kd) by fitting a one-site binding curve to the steady-state laser wavelengths measured across dsrm concentrations at saturation, using nonlinear regression. We additionally fit 1-site association/dissociation curves to the full time-course data in order to estimate the initial rates of RNA binding across dsrm concentrations and used these rates to calculate the dsrm concentration at which the 1/2-maximal RNA-binding rate was achieved (Km). Kds and Kms were −log10 transformed to facilitate visualization, and standard errors across three experimental replicates were calculated. We calculated the statistical significance of differences between Kds and Kms using the two-tailed unpaired t test, assuming unequal variances.

Data Availability

The structural alignment of dsrm domains and all phylogenetic trees reconstructed in this study are available in supplementary file full_trees.nexus.txt, Supplementary Material online, with identifiers mapped to NCBI accessions in supplementary files DRB_full_idmap.txt and dsrm_full_idmap.txt, Supplementary Material online. Ancestral-reconstructed sequences are provided in supplementary file ancestral_dsrms.fasta.txt, Supplementary Material online. Software, statistical models, usage tutorials, and protein–RNA affinity predictions are available online at: https://github.com/Klab-Bioinfo-Tools/GLM-Score (last accessed February 21, 2017).
Supplementary text, data tables, figures, and references are available in Supplementary File SI_01.pdf, Supplementary Material online.

Supplementary Material

Supplementary data are available at Molecular Biology and Evolution online.

Acknowledgment

This work was supported by the National Science Foundation (Molecular and Cellular Biology, grant number 1412442). Publication of this article was funded in part by the University of Florida Open Access Publishing Fund.

References

Abdiche Y, Malashock D, Pinkerton A, Pons J. 2008. Determining kinetics and affinities of protein interactions using a parallel real-time label-free biosensor, the Octet. Anal Biochem. 377:209–217.
Anisimova M, Gascuel O. 2006. Approximate likelihood-ratio test for branches: a fast, accurate, and powerful alternative. Syst Biol. 55:539–552.
Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B. 57:289–300.
Blevins T, Rajeswaran R, Shivaprasad PV, Beknazariants D, Si-Ammour A, Park HS, Vazquez F, Robertson D, Meins F, Jr., Hohn T, et al. 2006. Four plant Dicers mediate viral small RNA biogenesis and DNA virus induced silencing. Nucleic Acids Res. 34:6233–6246.
Bouche N, Lauressergues D, Gasciolli V, Vaucheret H. 2006. An antagonistic function for Arabidopsis DCL2 in development and a new function for DCL4 in generating viral siRNAs. EMBO J. 25:3347–3356.
Bridgham JT, Brown JE, Rodriguez-Mari A, Catchen JM, Thornton JW. 2008. Evolution of a new function by degenerative mutation in cephalochordate steroid receptors. PLoS Genet. 4:e1000191.
Bridgham JT, Carroll SM, Thornton JW. 2006. Evolution of hormone-receptor complexity by molecular exploitation. Science 312:97–101.
Bridgham JT, Keay J, Ortlund EA, Thornton JW. 2014. Vestigialization of an allosteric switch: genetic and structural mechanisms for the evolution of constitutive activity in a steroid hormone receptor. PLoS Genet. 10:e1004058.
Bridgham JT, Ortlund EA, Thornton JW. 2009. An epistatic ratchet constrains the direction of glucocorticoid receptor evolution. Nature 461:515–519.
Cai J, Zhao R, Jiang H, Wang W. 2008. De novo origination of a new protein-coding gene in Saccharomyces cerevisiae. Genetics 179:487–496.
Cenik ES, Fukunaga R, Lu G, Dutcher R, Wang Y, Tanaka Hall TM, Zamore PD. 2011. Phosphate and R2D2 restrict the substrate specificity of Dicer-2, an ATP-driven ribonuclease. Mol Cell 42:172–184.
Chen S, Krinsky BH, Long M. 2013. New genes as drivers of phenotypic evolution. Nat Rev Genet. 14:645–660.
Chendrimada TP, Gregory RI, Kumaraswamy E, Norman J, Cooch N, Nishikura K, Shiekhattar R. 2005. TRBP recruits the Dicer complex to Ago2 for microRNA processing and gene silencing. Nature 436:740–744.
Clavel M, Pelissier T, Montavon T, Tschopp MA, Pouch-Pelissier MN, Descombin J, Jean V, Dunoyer P, Bousquet-Antonelli C, Deragon JM. 2016. Evolutionary history of double-stranded RNA binding proteins in plants: identification of new cofactors involved in easiRNA biogenesis. Plant Mol Biol. 91:131–147.
Clifton BE, Jackson CJ. 2016. Ancestral protein reconstruction yields insights into adaptive evolution of binding specificity in solute-binding proteins. Cell Chem Biol. 23:236–245.
Conant GC, Wolfe KH. 2008. Turning a hobby into a job: how duplicated genes find new functions. Nat Rev Genet. 9:938–950.
Coordinators NR. 2016. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 44:D7–D19.
Cordaux R, Batzer MA. 2009. The impact of retrotransposons on human genome evolution. Nat Rev Genet. 10:691–703.
Curtin SJ, Watson JM, Smith NA, Eamens AL, Blanchard CL, Waterhouse PM. 2008. The roles of plant dsRNA-binding proteins in RNAi-like pathways. FEBS Lett. 582:2753–2760.
Daher A, Laraki G, Singh M, Melendez-Pena CE, Bannwarth S, Peters AH, Meurs EF, Braun RE, Patel RC, Gatignol A. 2009. TRBP control of PACT-induced phosphorylation of protein kinase R is reversed by stress. Mol Cell Biol. 29:254–265.
Darriba D, Taboada GL, Doallo R, Posada D. 2011. ProtTest 3: fast selection of best-fit models of protein evolution. Bioinformatics 27:1164–1165.
Dias R, Kolazckowski B. 2015. Different combinations of atomic interactions predict protein-small molecule and protein-DNA/RNA affinities with similar accuracy. Proteins 83:2100–2114.
Dickerman BK, White CL, Kessler PM, Sadler AJ, Williams BR, Sen GC. 2015. The protein activator of protein kinase R, PACT/RAX, negatively regulates protein kinase R during mouse anterior pituitary development. FEBS J. 282:4766–4781.
Duarte GT, Matiolli CC, Pant BD, Schlereth A, Scheible WR, Stitt M, Vicentini R, Vincentz M. 2013. Involvement of microRNA-related regulatory pathways in the glucose-mediated control of Arabidopsis early seedling development. J Exp Bot. 64:4301–4312.
Dunning Hotopp JC, Clark ME, Oliveira DC, Foster JM, Fischer P, Munoz Torres MC, Giebel JD, Kumar N, Ishmael N, Wang S, et al. 2007. Widespread lateral gene transfer from intracellular bacteria to multicellular eukaryotes. Science 317:1753–1756.
Eastman JM, Alfaro ME, Joyce P, Hipp AL, Harmon LJ. 2011. A novel comparative method for identifying shifts in the rate of character evolution on trees. Evolution 65:3578–3589.
Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.
Enright AJ, Van Dongen S, Ouzounis CA. 2002. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 30:1575–1584.
Eswar N, Eramian D, Webb B, Shen MY, Sali A. 2008. Protein structure modeling with MODELLER. Methods Mol Biol. 426:145–159.
Fareh M, Yeom KH, Haagsma AC, Chauhan S, Heo I, Joo C. 2016. TRBP ensures efficient Dicer processing of precursor microRNA in RNA-crowded environments. Nat Commun. 7:13694.
Felsenstein J. 1978. Cases in which parsimony or compatibility methods will be positively misleading. Syst Zool. 27:401–410.
Felsenstein J. 1973. Maximum-likelihood estimation of evolutionary trees from continuous characters. Am J Hum Genet. 25:471–492.
Frenzel D, Willbold D. 2014. Kinetic titration series with biolayer interferometry. PLoS One 9:e106882.
Fukunaga R, Han BW, Hung JH, Xu J, Weng Z, Zamore PD. 2012. Dicer partner proteins tune the length of mature miRNAs in flies and mammals. Cell 151:533–546.
Grishok A, Pasquinelli AE, Conte D, Li N, Parrish S, Ha I, Baillie DL, Fire A, Ruvkun G, Mello CC. 2001. Genes and mechanisms related to RNA interference regulate expression of the small temporal RNAs that control C. elegans developmental timing. Cell 106:23–34.
Hanson-Smith V, Kolaczkowski B, Thornton JW. 2010. Robustness of ancestral sequence reconstruction to phylogenetic uncertainty. Mol Biol Evol. 27:1988–1999.
Harms MJ, Thornton JW. 2010. Analyzing protein structure and function using ancestral gene reconstruction. Curr Opin Struct Biol. 20:360–366.
Hartig JV, Forstemann K. 2011. Loqs-PD and R2D2 define independent pathways for RISC generation in Drosophila. Nucleic Acids Res. 39:3836–3851.
Hill J, Davis KE. 2014. The Supertree Toolkit 2: a new and improved software package with a Graphical User Interface for supertree construction. Biodivers Data J. e1053.
Hobbs JK, Shepherd C, Saul DJ, Demetras NJ, Haaning S, Monk CR, Daniel RM, Arcus VL. 2012. On the origin and evolution of thermophily: reconstruction of functional precambrian enzymes from ancestors of Bacillus. Mol Biol Evol. 29:825–835.
Hughes AL. 1994. The evolution of functionally novel proteins after gene duplication. Proc R Soc Lond B Biol Sci. 256:119–124.
Kassahn KS, Dang VT, Wilkins SJ, Perkins AC, Ragan MA. 2009. Evolution of gene function and regulatory control after whole-genome duplication: comparative analyses in vertebrates. Genome Res. 19:1404–1418.
Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 30:772–780.
Ketting RF, Fischer SE, Bernstein E, Sijen T, Hannon GJ, Plasterk RH. 2001. Dicer functions in RNA interference and in synthesis of small RNA involved in developmental timing in C. elegans. Genes Dev. 15:2654–2659.
King N, Hittinger CT, Carroll SB. 2003. Evolution of key cell signaling and adhesion protein families predates animal origins. Science 301:361–363.
Kloosterman WP, Plasterk RH. 2006. The diverse functions of microRNAs in animal development and disease. Dev Cell 11:441–450.
Knight SW, Bass BL. 2001. A role for the RNase III enzyme DCR-1 in RNA interference and germ line development in Caenorhabditis elegans. Science 293:2269–2271.
Kok KH, Ng MH, Ching YP, Jin DY. 2007. Human TRBP and PACT directly interact with each other and associate with dicer to facilitate the production of small interfering RNA. J Biol Chem. 282:17649–17657.
Kuck P, Mayer C, Wagele JW, Misof B. 2012. Long branch effects distort maximum likelihood phylogenies in simulations despite selection of the correct model. PLoS One 7:e36593.
Kuraku S. 2013. Impact of asymmetric gene repertoire between cyclostomes and gnathostomes. Semin Cell Dev Biol. 24:119–127.
Kurihara Y, Takashi Y, Watanabe Y. 2006. The interaction between DCL1 and HYL1 is important for efficient and precise processing of pri-miRNA in plant microRNA biogenesis. RNA 12:206–212.
Laraki G, Clerzius G, Daher A, Melendez-Pena C, Daniels S, Gatignol A. 2008. Interactions between the double-stranded RNA-binding proteins TRBP and PACT define the Medipal domain that mediates protein-protein interactions. RNA Biol. 5:92–103.
Liu B, Chen Z, Song X, Liu C, Cui X, Zhao X, Fang J, Xu W, Zhang H, Wang X, et al. 2007. Oryza sativa dicer-like4 reveals a key role for small interfering RNA silencing in plant development. Plant Cell 19:2705–2718.
Liu X, Jiang F, Kalidas S, Smith D, Liu Q. 2006. Dicer-2 and R2D2 coordinately bind siRNA to promote assembly of the siRISC complexes. RNA 12:1514–1520.
Liu Y, Schmidt B, Maskell DL. 2010. MSAProbs: multiple sequence alignment based on pair hidden Markov models and partition function posterior probabilities. Bioinformatics 26:1958–1964.
Lu R, Maduro M, Li F, Li HW, Broitman-Maduro G, Li WX, Ding SW. 2005. Animal virus replication and RNAi-mediated antiviral silencing in Caenorhabditis elegans. Nature 436:1040–1043.
Malcolm BA, Wilson KP, Matthews BW, Kirsch JF, Wilson AC. 1990. Ancestral lysozymes reconstructed, neutrality tested, and thermostability linked to hydrocarbon packing. Nature 345:86–89.
Marchler-Bauer A, Bryant SH. 2004. CD-Search: protein domain annotations on the fly. Nucleic Acids Res. 32:W327–W331.
Marchler-Bauer A, Derbyshire MK, Gonzales NR, Lu S, Chitsaz F, Geer LY, Geer RC, He J, Gwadz M, Hurwitz DI, et al. 2015. CDD: NCBI's conserved domain database. Nucleic Acids Res. 43:D222–D226.
Marques JT, Kim K, Wu PH, Alleyne TM, Jafari N, Carthew RW. 2010. Loqs and R2D2 act sequentially in the siRNA pathway in Drosophila. Nat Struct Mol Biol. 17:24–30.
Merkl R, Sterner R. 2016. Ancestral protein reconstruction: techniques and applications. Biol Chem. 397:1–21.
Mukherjee K, Campos H, Kolaczkowski B. 2013. Evolution of animal and plant dicers: early parallel duplications and recurrent adaptation of antiviral RNA binding in plants. Mol Biol Evol. 30:627–641.
Nag A, Jack T. 2010. Sculpting the flower; the role of microRNAs in flower development. Curr Top Dev Biol. 91:349–378.
Nguyen Ba AN, Strome B, Hua JJ, Desmond J, Gagnon-Arsenault I, Weiss EL, Landry CR, Moses AM. 2014. Detecting functional divergence after gene duplication through evolutionary changes in posttranslational regulatory sequences. PLoS Comput Biol. 10:e1003977.
Nguyen N, Mirarab S, Warnow T. 2012. MRL and SuperFine+MRL: new supertree methods. Algorithms Mol Biol. 7:3.
Ogawa T, Shirai T. 2014. Tracing ancestral specificity of lectins: ancestral sequence reconstruction method as a new approach in protein engineering. Methods Mol Biol. 1200:539–551.
Ogden TH, Rosenberg MS. 2006. Multiple sequence alignment accuracy and phylogenetic inference. Syst Biol. 55:314–328.
Ohno S. 1984. Birth of a unique enzyme from an alternative reading frame of the preexisted, internally repetitious coding sequence. Proc Natl Acad Sci USA. 81:2421–2425.
Orengo CA, Thornton JM. 2005. Protein families and their evolution – a structural perspective. Annu Rev Biochem. 74:867–900.
Pao GM, Saier MH Jr. 1995. Response regulators of bacterial signal transduction systems: selective domain shuffling during evolution. J Mol Evol. 40:136–154.
Pollock DD, Chang BS. 2007. Dealing with uncertainty in ancestral sequence reconstruction: sampling from the posterior distribution. In: Liberles DA, editor. Ancestral sequence reconstruction. Oxford: Oxford University Press.
Price MN, Dehal PS, Arkin AP. 2010. FastTree 2 – approximately maximum-likelihood trees for large alignments. PLoS One 5:e9490.
Pronk S, Pall S, Schulz R, Larsson P, Bjelkmar P, Apostolov R, Shirts MR, Smith JC, Kasson PM, van der Spoel D, et al. 2013. GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit. Bioinformatics 29:845–854.
Pruitt KD, Tatusova T, Maglott DR. 2007. NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 35:D61–D65.
Qu F, Ye X, Morris TJ. 2008. Arabidopsis DRB4, AGO1, AGO7, and RDR6 participate in a DCL4-initiated antiviral RNA silencing pathway negatively regulated by DCL1. Proc Natl Acad Sci USA. 105:14732–14737.
Rastogi S, Liberles DA. 2005. Subfunctionalization of duplicated genes as a transition state to neofunctionalization. BMC Evol Biol. 5:28.
Reverberi R, Reverberi L. 2007. Factors affecting the antigen-antibody reaction. Blood Transfus. 5:227–240.
Ronquist F, Teslenko M, van der Mark P, Ayres DL, Darling A, Hohna S, Larget B, Liu L, Suchard MA, Huelsenbeck JP. 2012. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol. 61:539–542.
Rose PW, Bi C, Bluhm WF, Christie CH, Dimitropoulos D, Dutta S, Green RK, Goodsell DS, Prlic A, Quesada M, et al. 2013. The RCSB Protein Data Bank: new resources for research and education. Nucleic Acids Res. 41:D475–D482.
Ryter JM, Schultz SC. 1998. Molecular basis of double-stranded RNA-protein interactions: structure of a dsRNA-binding domain complexed with dsRNA. EMBO J. 17:7505–7513.
Saha SK, Pietras EM, He JQ, Kang JR, Liu SY, Oganesyan G, Shahangian A, Zarnegar B, Shiba TL, Wang Y, et al. 2006. Regulation of antiviral responses by a direct and specific interaction between TRAF3 and Cardif. EMBO J. 25:3257–3263.
Saleh MC, Tassetto M, van Rij RP, Goic B, Gausson V, Berry B, Jacquier C, Antoniewski C, Andino R. 2009. Antiviral immunity in Drosophila requires systemic RNA interference spread. Nature 458:346–350.
Sayed D, Abdellatif M. 2011. MicroRNAs in development and disease. Physiol Rev. 91:827–887.
Segers GC, Zhang X, Deng F, Sun Q, Nuss DL. 2007. Evidence that RNA silencing functions as an antiviral defense mechanism in fungi. Proc Natl Acad Sci USA. 104:12902–12906.
Shen MY, Sali A. 2006. Statistical potential for assessment and prediction of protein structures. Protein Sci. 15:2507–2524.
Shih P, Malcolm BA, Rosenberg S, Kirsch JF, Wilson AC. 1993. Reconstruction and testing of ancestral proteins. Methods Enzymol. 224:576–590.
Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H, Remmert M, Soding J, et al. 2011. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol. 7:539.
Song MD, Wachi M, Doi M, Ishino F, Matsuhashi M. 1987. Evolution of an inducible penicillin-target protein in methicillin-resistant Staphylococcus aureus by gene fusion. FEBS Lett. 221:167–171.
Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30:1312–1313.
Svec F, Yeakley J, Harrison RW 3rd. 1980. The effect of temperature and binding kinetics on the competitive binding assay of steroid potency in intact AtT-20 cells and cytosol. J Biol Chem. 255:8573–8578.
Talavera G, Castresana J. 2007. Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst Biol. 56:564–577.
Taylor DW, Ma E, Shigematsu H, Cianfrocco MA, Noland CL, Nagayama K, Nogales E, Doudna JA, Wang HW. 2013. Substrate-specific structural rearrangements of human Dicer. Nat Struct Mol Biol. 20:662–670.
Taylor JS, Raes J. 2004. Duplication and divergence: the evolution of new genes and old ideas. Annu Rev Genet. 38:615–643.
Tirosh I, Barkai N. 2007. Comparative analysis indicates regulatory neofunctionalization of yeast duplicates. Genome Biol. 8:R50.
Ugalde JA, Chang BS, Matz MV. 2004. Evolution of coral pigments recreated. Science 305:1433.
Umbach JL, Cullen BR. 2009. The role of RNAi and microRNAs in animal virus replication and antiviral immunity. Genes Dev. 23:1151–1164.
van Hazel I, Sabouhanian A, Day L, Endler JA, Chang BS. 2013. Functional characterization of spectral tuning mechanisms in the great bowerbird short-wavelength sensitive visual pigment (SWS1), and the origins of UV/violet vision in passerines and parrots. BMC Evol Biol. 13:250.
Veitia RA, Bottani S, Birchler JA. 2013. Gene dosage effects: nonlinearities, genetic interactions, and dosage compensation. Trends Genet. 29:385–393.
Voordeckers K, Brown CA, Vanneste K, van der Zande E, Voet A, Maere S, Verstrepen KJ. 2012. Reconstruction of ancestral metabolic enzymes reveals molecular mechanisms underlying evolutionary innovation through gene duplication. PLoS Biol. 10:e1001446.
Wang W, Yu H, Long M. 2004. Duplication-degeneration as a mechanism of gene fission and the origin of new genes in Drosophila species. Nat Genet. 36:523–527.
Wheeler WC, Gatesy J, DeSalle R. 1995. Elision: a method for accommodating multiple molecular sequence alignments with alignment-ambiguous sites. Mol Phylogenet Evol. 4:1–9.
Whelan S, Goldman N. 2001. A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol. 18:691–699.
Whitfield JH, Zhang WH, Herde MK, Clifton BE, Radziejewski J, Janovjak H, Henneberger C, Jackson CJ. 2015. Construction of a robust and sensitive arginine biosensor through ancestral protein reconstruction. Protein Sci. 24:1412–1422.
Williams PD, Pollock DD, Blackburne BP, Goldstein RA. 2006. Assessing the accuracy of ancestral protein reconstruction methods. PLoS Comput Biol. 2:e69.
Wilson RC, Tambe A, Kidwell MA, Noland CL, Schneider CP, Doudna JA. 2015. Dicer-TRBP complex formation ensures accurate mammalian microRNA biogenesis. Mol Cell 57:397–407.
Yang SW, Chen HY, Yang J, Machida S, Chua NH, Yuan YA. 2010. Structure of Arabidopsis HYPONASTIC LEAVES1 and its molecular implications for miRNA processing. Structure 18:594–605.
Yang Z, Kumar S, Nei M. 1995. A new method of inference of ancestral nucleotide and amino acid sequences. Genetics 141:1641–1650.
Zambon RA, Vakharia VN, Wu LP. 2006. RNAi is an antiviral immune response against a dsRNA virus in Drosophila melanogaster. Cell Microbiol. 8:880–889.
Zhang J. 2003. Evolution by gene duplication: an update. Trends Ecol Evol. 18:292–298.
Zhang Y, Song G, Hsu CH, Miller W. 2009. Simultaneous history reconstruction for complex gene clusters in multiple species. Pac Symp Biocomput. 162–173.
Zhou R, Czech B, Brennecke J, Sachidanandam R, Wohlschlegel JA, Perrimon N, Hannon GJ. 2009. Processing of Drosophila endo-siRNAs depends on a specific Loquacious isoform. RNA 15:1886–1895.
Zmasek CM, Godzik A. 2011. Strong functional patterns in the evolution of eukaryotic genomes revealed by the reconstruction of ancestral protein domain repertoires. Genome Biol. 12:R4.
Zor T, Selinger Z. 1996. Linearization of the Bradford protein assay increases its sensitivity: theoretical and experimental studies. Anal Biochem. 236:302–308.
Zwickl DJ, Hillis DM. 2002. Increased taxon sampling greatly reduces phylogenetic error. Syst Biol. 51:588–598.