So good afternoon, everyone. My name is Elena Steponaitis; I'm a program executive in the Chief Science Data Office at NASA Headquarters, and I am co-chairing the Science Mission Directorate's Data and Computing Architecture Study with Mike Little, who's also here in this meeting. I'm really excited that we have two speakers from the Microsoft Planetary Computer here to speak with us today, but first I want to hand it over to Kevin Murphy, NASA's Chief Science Data Officer, for a quick introduction to what we're doing in the study and with open science at NASA.

Hey everybody. First, can you hear me, Elena? Just give me a thumbs up. Okay, good. I do apologize for being late, so thanks for bearing with me, but I really did want to come and kick this meeting off, because what we're trying to do is really going to be informed by how we work with partners like Microsoft, and you all are doing some really wonderful work with the Planetary Computer. So let's kick it off. Next slide, please.
We are conducting a data and computing architecture study because we recognize that the NASA systems we've traditionally been using need to be upgraded to really take advantage of the high data volumes and the new technologies and capabilities that exist in the commercial world. We're doing that in addition to, or in alignment with, our open-source science activities that we kicked off a couple of years ago, and now is the critical time for us to evaluate, at a fundamental level, how we're supporting open-source science with our data and computing infrastructure. Next slide, please.

The scope of this study is really everything we do within the Science Mission Directorate for science. It includes scientific modeling and simulation activities; data processing, primarily from satellites and other non-Earth-based systems, including Mars and everything else; how we take data from level zero to higher levels; how we do analytics on that information; and how we integrate new processing and analysis techniques like AI and ML to support those really large data volumes. The capabilities we're considering include the number of commercial cloud environments that we support, our high-end computing capabilities out at Ames in the Bay Area, and our scientific computing capability at the Goddard Space Flight Center. Next slide, please.
One of the things we have with open-source science is this concept of open meetings, or at least hybrid open meetings, so that people can participate in how we develop our systems, which positions them to collaborate with us better in the future. Because we have hybrid open meetings, we do have a code of conduct. It includes being respectful and considerate of how people work; valuing diversity; communicating openly and with respect for others, including making sure you critique ideas, not individuals; avoiding personal attacks; being mindful of your surroundings and your fellow participants; and alerting us if you have any issues. We do not accept harassment, intimidation, or discrimination in any form, nor verbal abuse, and we have a number of examples of unacceptable behavior. We have unfortunately run into disruptions in the past, and sometimes we have to deal with them. So thank you for listening to this part. I know it can feel a little cumbersome to go through, but I think it's really important that we have these discussions up front, especially as we hold more meetings. Next slide.
At this point I'm going to hand it over and let somebody else talk, but before I send it over to Peter, I'd really like to say thanks again for being patient with me, and I'm really looking forward to hearing and discussing what you all are up to.

Thank you, Kevin. I'm Peter Williams; I'm going to help moderate the conversation, fielding questions and passing them to Bruno and Tom, our guests today. If you're interested in asking a question, or you want to see the questions already in the queue, the best way is to submit them through NASA's io tool. Your questions go in anonymously, and you can upvote questions submitted by yourself and by others. I'll try to focus on the more popular questions, and I'll also do what I can to link questions that seem to be in a similar vein. You can use the QR code to go there; I've also put the link in the chat. If you prefer, you can fall back to the WebEx chat; I'll do what I can to track those, and either Hannah or I will grab them from the chat and get them over to the io tool so they can be part of that same upvoting process. Next slide, please.
It's my pleasure to welcome Tom Augspurger and Bruno Sánchez-Andrade Nuño, who are going to be presenting today. They are both involved with the Microsoft Planetary Computer: Bruno is director of Microsoft's Planetary Computer, and Tom is a geospatial software engineer on the project. They're going to provide us with an overview of the project, seen through somewhat of an open-source science lens. Hopefully we'll hear about user needs as well as business needs that will likely be relevant to NASA SMD's transition to open-source science, picking up and building on work that has been happening within SMD and elsewhere in NASA, but also leveraging the insights and experience of folks outside of NASA, including, in this case, Microsoft. As part of that, we're looking to hear about best practices that might be identifiable through their work, and also insights regarding data and computing architecture, which is ultimately a primary focus for this study: we're trying to put together proposals for a design of the open-source science data and computing architecture that, again, will build on good work that's been happening to date but also set the stage for needed work in the future.

As I mentioned, Bruno is the director of Microsoft's Planetary Computer. He has a PhD in astrophysics and a rocket-science postdoc; he led big data innovation at the World Bank's innovation labs and served as vice president for social impact at the satellite company Satellogic and as chief scientist at Mapbox. He has held a science policy fellowship of the U.S. National Academies and has been named a Young Global Leader of the World Economic Forum. Welcome, Bruno. Joining Bruno will be Tom Augspurger. Tom is a geospatial software engineer, as I mentioned, working at Microsoft on the Planetary Computer. He's a member of the Pangeo steering council and a maintainer of several open-source libraries in the scientific Python ecosystem, including pandas and Dask. With that, it is my pleasure to welcome both Tom and Bruno, and to turn it over to you.
There you go. Thank you, Gary and Peter and everyone else; it is such a pleasure to be here. We've got some time, so hopefully we can answer all of your questions. First I'll speak more on why we are doing this. As director of the program, most of my time is spent not on coding but on Outlook and PowerPoint, making sure that Tom and the rest of the team can actually deliver on our promise. Tom has the really cool title of geospatial architect, and I understand a lot of the questions we'll get will be for him.

As Peter was saying, I'm going to give my presentation, and while I do that — I understand most of you are at NASA Headquarters in DC. I did my postdoc at the Naval Research Laboratory, with NASA funding, on sounding rockets for the solar chromosphere, so it's kind of nice to be back in a way. So glad to be here. As I was saying, the structure is: we're going to tell you why we are doing this, which I think is important because it will hopefully show you that one of our key values is to be completely transparent about how we are building this. That presentation will be maybe half an hour or so; we can have some questions on the why, and then a little bit on how we are building it. Then Tom will do more of a demo workshop and hopefully answer all of your questions, as technical as they might be. So let's get to it.
From the start: when I was preparing for this talk, I saw that the goals for the Science Mission Directorate are to coordinate cloud-based and high-end computing and to capture the compute needs. All of this you could basically copy and paste, and it would be the why of when we proposed to build the Planetary Computer. So in a sense, the best outcome of today is for you to see that the Planetary Computer looks very similar to what you want to do, and that you choose the same technologies we chose — not because we chose them. We did not invent any language, we did not imagine the architecture. The entire Planetary Computer is open source, and the choices we made of the tools we use were made because the community decided, or we understood that the community decided, that was the way to go. The standards we chose and the computing environments we chose are all based on that.

Part of the reason we built it that way is to ensure there is the least amount of friction between knowledge creation by someone on the academic side and the application of that knowledge in operational deployments. Because at the end of the day, I think we all agree that a lot of the things you are doing, and all the things we are doing, are extremely critical to face issues like climate change or biodiversity collapse. But it's not just saying what's happening, it's not even just putting numbers in peer-reviewed articles; it's then figuring out, okay, how does the government use this, how do commercial clients, non-profits, NGOs, everyone in civil society use this. That's why we're building this completely in the open and trying to minimize those frictions.
I know some of the key questions that you sent me, some business needs; I already spoke to some of those, but I really want to go into the summary of why we're doing this and the environment this comes from. Our needs, or the business needs, are that we see pretty much every single company, every single NGO, and every single government looking at sustainability, at issues like extreme events driven by climate change causing a lot of damage, or at ESG and related reporting frameworks. There is a lot of need to understand what's happening, and that in many ways means an extremely hard computational environment that is hard to scale. So our user need is to make it as simple as possible but at the same time as technologically advanced as possible, so you are working with the latest available data and the latest available frameworks. That's why we chose the architecture we're going to see in a second; we can come back to these questions.

As I said, at the end of the day, what we want is to build this queryable Earth, and when I say "we" I don't mean Microsoft. I think we as a society, we as a planet, should be able to figure out how to do this queryable Earth: how to ask questions of what is where, how much is there, how much is it changing, what could be there, what should be there. All of those things are in our collective interest to figure out, in a way that is as open as possible. It is not the intent that we build this and no one else does; it is the opposite. The intent is that everyone knows how to build these things. Of course, when it comes to large, planetary scales, it is going to be hard for most entities to serve or create these repositories, so they will go to institutions like yours, or to cloud providers like us, and we can coordinate and build similar things that are beneficial for everyone.
These next few slides are a little more on the why. I don't think this audience — which, by the way, is almost hitting 100 people, so thank you everyone for being here on a Friday; hopefully we hit that number — really needs the why. What I would mention, though, before we go into the Planetary Computer, is what we see in environmental sustainability, and I would argue you could also say this of science itself, since sustainability is underpinned heavily by science, so that is not surprising. When I say environmental sustainability, you can also read science: it is increasingly complicated; it is increasingly recognized as a dependency and an opportunity for more stakeholders, from governments to commercial entities; and it is all interrelated. You cannot think of sustainability, or science, and not think about nature, people, and livelihoods.

On those three points: it is increasingly complicated because we have way more data, and the data is more complex; we have way more tools, which are also more complex; and hopefully we also have more questions, and the questions are complex. All of this means it is really hard to find a person, or an institution, able to know all of those things — someone who knows about AI, and also about geospatial, and also about sustainability, and also about, say, ocean preservation. It is getting harder, and that's why focusing on making it as easy as possible, putting together the people who are experts in these fields, and combining all of it into the same platform is critical.
The second point is that it's increasingly recognized as a critical dependency. I'm sure this audience knows about the Sustainable Development Goals; the ones in green are obviously related to sustainability, but you could argue that pretty much every other one is also related, from food to energy to infrastructure and innovation. If you prefer to talk about money: the World Economic Forum has identified the top risks by likelihood and by impact, and half of them are also related to environmental sustainability. So this is not only about the moral call; it's also about the risks to our socio-economic way of doing things. You can also see it in the amount of interest and funds this is attracting: there's a quote from The Economist that everything in climate and sustainability is hard — everything except raising capital. There are more and more funds willing to invest in these issues, and that's great, because we need all of it.

The third point is that it's interconnected. This is basically the top line of the joint report between the top experts on climate change, the IPCC, and the top experts on biodiversity, IPBES. Again, the point probably doesn't need making for this audience: it's very much interrelated, and it's not only about nature, it's also about people. For example, if you protect a section of the ocean, the spillover of species outside the protected area yields more fish, more catch, than if we didn't have the protection in place. So nature-based solutions and protecting the environment are not only the right thing to do; they are also an opportunity for business and for others.
you are all interested in how we will deprive   computer but it's I think it's important to 
know why we're doing this Microsoft cares about   all of the sustainability commitments has four 
top commitments for 2030 carbon negative zero   waste water positive and building a planetary 
computer it's that important for us so it's not   that we want to build a planet as a product for 
Microsoft this is like other products is that   we believe there should be the Technologies for 
um for addressing all of these issues and that   is why we did the planetary computer so let's 
go into the meat of it ideally you go from data   to decisions and ideally you have some data to 
storage and you process it you do some analytics   you have applications and then you get results 
that's a very simplistic view because in the end   there's so many types of data there's so many 
types of storage decisions and locations and   formats and Analysis and you end up having an 
incredibly scattered not interconnected array of   both data storage and formats and and everything 
you could try to put everything together you could   try to standardize put everything in one place 
and then that should allow you to have more   frictionless process from data to knowledge and 
that is the planetary computer that's will build   it so we're building a foundational architecture 
for aquarable Earth as I said before basically a   digital twin this is something that is becoming 
more and more um well as a as a namesake for   what a lot of people are doing as a service and 
basically a planet where Cloud native environment   I should know that I've had this morning this 
same conversation with another public entity   that is the data provider um and I'm also having 
this income conversation with the European Space   Agency as they are thinking of destination Earth 
destination e and this is the the meat of it is   as I've been saying a few times already our hope 
is that we can share all these businesses we've   built the plant that computers and completely 
open source with open data and by doing that   I'm sharing what we're doing we we can coordinate 
and build things together because it would be it   extremely beneficial for everyone if for 
example we could share across formats and   data and minimize the work for that everyone is 
So the Planetary Computer is four things; you can think of them as abstraction layers. We have a data catalog, which is basically a pile of files. We try to keep the files as untouched as possible, but we also try to have them in cloud-native formats, which allows range requests: if you have a big file, you don't need to download the whole file to get a small section, you can just request the relevant section. For rasters that means cloud-optimized GeoTIFFs; for vectors and polygons it's a bit more of a conversation we should have. That's the idea: put all the files in the same data center and have this big pile of files.
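As a concrete illustration of those range requests, here is a minimal sketch of a windowed read from a cloud-optimized GeoTIFF over HTTP; the URL is a hypothetical placeholder, and GDAL (underneath rasterio) issues the byte-range requests for you.

```python
import rasterio
from rasterio.windows import Window

# Hypothetical COG URL; any HTTP-accessible cloud-optimized GeoTIFF behaves the same way.
url = "https://example.blob.core.windows.net/data/scene.tif"

with rasterio.open(url) as src:
    # Only the bytes covering this 512x512 window are fetched via HTTP range
    # requests, not the whole (possibly multi-gigabyte) file.
    window = Window(col_off=0, row_off=0, width=512, height=512)
    chunk = src.read(1, window=window)

print(chunk.shape)  # (512, 512)
```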
We then ingest all of those files into one metadata database. We have 52 petabytes of data stored in that same data center — that's the big pile of files — and every single byte of those petabytes is tagged with metadata: what is it from, when is it from, what is the cloudiness, or any other characteristics it has. The core of the Planetary Computer is that metadata API, so you can say: give me data for Washington, DC when it's not cloudy in 2024, and it doesn't matter whether it's Landsat or Sentinel or whatever it is.

Then, separate architecturally from that but in the same data center, we have a compute environment. If you know Pangeo or JupyterLab, that is it; there is nothing special we do to it. It's a deployed Pangeo instance that we make available to everyone who wants to use it, but we also have recommendations — and I think Tom is going to cover this — for deploying your own instance so you have full control of the compute. There's no extra cost beyond the running cost of the thing if you deploy it; if you use our environment, there is no cost for that. I forgot to mention that the metadata specification is the STAC spec — SpatioTemporal Asset Catalogs — which is an open standard, and we basically have a pgSTAC database with all of that. Then the last layer is the applications: what we do, and also what users and customers do, with that data. They can connect directly to the files, or to the metadata API, or to the compute environment. It's modular in that sense, by design.
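Because the metadata API is a plain STAC API, the "give me data for Washington, DC when it's not cloudy" query can be expressed as a single HTTP search request. A minimal sketch — the bounding box, year, and cloud threshold below are illustrative values, not ones taken from the talk:

```python
import requests

# The public search endpoint of the Planetary Computer STAC (metadata) API.
SEARCH_URL = "https://planetarycomputer.microsoft.com/api/stac/v1/search"

body = {
    "collections": ["landsat-c2-l2"],              # Landsat Collection 2 Level-2
    "bbox": [-77.12, 38.79, -76.90, 39.00],        # approximate Washington, DC
    "datetime": "2024-01-01/2024-12-31",
    "query": {"eo:cloud_cover": {"lt": 10}},       # "when it's not cloudy"
    "limit": 10,
}

response = requests.post(SEARCH_URL, json=body)
response.raise_for_status()
for item in response.json()["features"]:
    print(item["id"], item["properties"]["eo:cloud_cover"])
```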
Here is a bit of an architecture diagram that hopefully helps you understand what we're doing. Right now — and Tom, please correct me later if this is not the case — we've done the painful work of going to 84 sources, some of them NASA or USGS or NOAA or the European Sentinels, and asking: for now we are focusing on open data, so what open data do you have? Let's ingest it; if it's not in a cloud-native format, let's do that work, and let's put it in the same data center. When we do the ingestion, we use the STAC specification to create the database of metadata.

The green arrows on the right are the ways a user can consume the data. The authentication API basically just gives you a token to access blob storage — anyone can get an account for this — so you can get the files directly if you know the file and everything about it. If you want to use the metadata API, you can consume it by asking questions like, give me data for Iowa, or whatever it is. We also have the data catalog, which is actually a website that consumes the metadata API, presents it nicely, and gives you documentation and examples and things like that. We also have a tiler — this is more for raster — which is basically TiTiler, for those who know it: it gets the data from blob storage, via the metadata API and the files, and serves it to your browser over the internet so you can consume it quickly. Happy to do a demo right now, but the idea is that it's just like the experience when you go to Google Maps or Bing Maps or any other rendered map: you select the data — in this case the metadata, the characteristics of the source and the times — and on the fly, most of the time in less than half a second, you get a map that looks like one of these online maps, but rendered directly from blob storage through this tiler architecture. That's the data API.
And the PC Explorer is an application of the Planetary Computer; if you go to planetarycomputer.microsoft.com you'll see all of those things. Then, in the same data center but separated, as I said before, we have the compute environment, which is the Pangeo deployment. To give you a sense of what we're doing right now: it's about 52 petabytes of data stored, roughly one and a half petabytes of egress a month, and about 2 billion calls to the metadata API. That is key: it means our users are not downloading files, which is great; they are only downloading the little bytes of information they need for their request, so they get their outputs faster, we pay for less egress, and everyone's happy. That's the trick of this metadata API, which I believe is at the core of the whole architecture. We also track the compute environment usage — CPU hours and so on — and that is only for the one we manage.
There are a lot of clients deploying their own compute environments, and the STAC specification is key for that. There is a recent Planet blog post where they go through this and find that we are already the largest database of STAC metadata assets. To me that's great, but at the same time it makes me think: most of the assets all of these people have in their archives are the same. Everyone is indexing their own Landsat, everyone is indexing their own Sentinel, and I wish we could coordinate better and share, maybe, a federated STAC environment, or network, or whatever you want to call it — so that you not only know where the data comes from (one of the things we try to put in the metadata API is the source of the data, what has changed, the provenance of the data), but I think we could do much better. I'm hopeful we can talk about this later on the call. This is a little bit of the catalog and the Explorer, and I think at this point it's better if I just do a quick little demo while I switch over.
I don't know, Peter, if there are questions so far. I'm just going to the browser — Microsoft Edge, there you go. Peter, you haven't said anything; I don't know, are there any questions?

There are some questions, but I'll kind of leave it up to you.

Let me do this demo first. Okay, demo — there we go: planetarycomputer.microsoft.com. As I said before: pile of files, metadata API, compute environment, data catalog. This is the pile of files: these are 86 sources, and of course Landsat is one of them. Every single collection has an overview, then the assets we are hosting, and each of them has an example notebook you can start from so you can get working quickly, plus the providers, the license, a paper if there is one related to it, and the spectral bands. All of this is actually coming from the metadata API itself; it's just rendered nicely in the browser. And as you see here, we have this "Launch in Explorer" button — you can also get to it here — which is what I talked about before, and which kind of combines everything. This is what I'm going to show you quickly. You can go to Pakistan — you know there have been floods there recently — and I also know that radar, SAR, is really good for detecting flooded areas. We do process Sentinel-1 GRD into radiometrically terrain-corrected (RTC) data.
It's one of the few datasets we also process to create a product, which is released openly. This is the latest one, and as you can see, it's already rendered on the fly. These are the results of the most recent Sentinel-1 passes since the flood happened a couple of weeks ago. We can go back and select an end date, and these are the results; these are the assets that correspond to what I said here. I can click on each of them and get the metadata; I get the assets to do whatever I want; I get code I can copy and paste into whatever analysis I'm doing; and I also get the code of the call itself — hey, what do you have for this region, these date ranges, this metadata specification — which I can copy and paste and then use in Pangeo and JupyterLab.
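The snippet the Explorer hands you is essentially a STAC search against the Sentinel-1 RTC collection. A rough sketch of what such a copied query looks like with pystac-client — the bounding box and date range below are illustrative placeholders, not the exact values from the demo:

```python
from pystac_client import Client

# Connect to the Planetary Computer's STAC metadata API.
catalog = Client.open("https://planetarycomputer.microsoft.com/api/stac/v1")

# Approximate box over the flooded region of Pakistan and a date range around
# the event; both are illustrative, not copied from the demo.
search = catalog.search(
    collections=["sentinel-1-rtc"],
    bbox=[67.0, 24.0, 71.0, 28.0],
    datetime="2022-08-01/2022-09-15",
)
items = list(search.items())
print(f"Found {len(items)} Sentinel-1 RTC scenes")
```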
What you see here is what I mentioned before: this is rendered on the fly as I move the mouse around, in less than half a second. I can also say I want to see the comparison between now and before — four clicks, or just changing one line of code — and then I see that indeed this is the situation, and I can flip back and forth to see the massive flood that happened there. You can do exactly the same thing for any of our sources. You can also share this link with anyone so they can see the same thing; these days you can very quickly share the status of a disaster that way. I know there are others — NASA has invested a lot of time in similar tools — but what I really like about this one is the frictionless path from exploration to analysis with these little tidbits of code.

Then there's the documentation: as I said before, everything is documented, including how to read from STAC, and every dataset has its own example notebooks. There are also tutorials, which I think Tom is going to cover in a second. But this is it: this is the Planetary Computer. It's a geospatial platform built modularly on a pile of files, a metadata API, and a compute environment, ready for you to clone if you want — and since you probably don't want to clone 54 petabytes of data and maintain the metadata, that's why it becomes a service that a commercial company like ours at Microsoft can provide. Happy to answer questions.
The more technical side starts afterwards when Tom goes over it, but if there are questions about why we're doing this, or about the strategy, I'm happy to take those.

Wonderful, I think that would be great. We do have a couple of questions that really fit the why, as you said. Thinking about either existing or anticipated user needs and business needs — that first question you showed, which is getting a lot of attention today on the list of questions — we're wondering which ones you would lift up and really highlight for the purposes of this study, as potentially guiding NASA's Science Mission Directorate in its pursuit of data and computing infrastructure, especially infrastructure to support open-source science.
I thought a lot about that. What I see more and more, hopefully or thankfully, is that there is more attention to climate change and specifically to extreme events, because they are the biggest driver of economic losses, and assessing the risks of climate change and these weather events is critical. That is something very much connected to science, because if it's just consumption of geospatial data — and correct me if I'm wrong — I don't think it's core to the NASA Science Mission Directorate simply to be the provider of that data. The data is there, you're producing it, people are using it, fantastic. But we are not there yet, because we need better science and better dissemination. I think climate services and climate risk are right there. We have a lot of science already, but it seems really hard to go from academic settings and papers to: okay, I'm company X and I want to assess my climate risk — is it floods, is it droughts, what is it — and how do I convert that scientific knowledge into my operations?

That's why this idea matters: if you build an output, say flood risk, like we did with one of our datasets — global flood risk due to tidal storm surges and sea level rise — then people are going to have questions. It's great to have the dataset, but they will ask how you did it, and what if they did it slightly differently. So the idea that you could have exactly the environment that produced that output, ready to be deployed on the customer side — the government, or the city doing this work — and then adjusted to their own needs, because they have different assets or a different source of data, is very powerful. Technologies like Binder — I don't know, Tom, if you're going to talk about that or not — embody the idea that you can deploy an environment, our complete workbench, not only the code but also the infrastructure, infrastructure as code. That to me is extremely powerful, because it would cover the need of anyone who is being a bit strategic, and of everyone who is going to have to answer to regulations on disclosing these climate risks. This is an area where, I would say, the market is very immature. Carbon offsets are the same thing: there is a lack of clarity about what these carbon offsets mean, or about the additionality of carbon credits — what is the measure of all of these things? The Science Mission Directorate is, I think, core to helping the world answer what they mean and what they measure, and then to maturing those markets. If I had to pick one, it would be that one, Peter.
Great, thank you. Maybe one more, and then I want to leave plenty of time for Tom. I'm going to skip a question here that I think Tom is probably going to speak to — there are some questions that are a little more on the technical side — but Bruno, before you close this piece, there's a question around issues that a successful data and computing infrastructure may need to anticipate. Related to those issues, and maybe opportunities: are there any particular issues you would suggest NASA really pay attention to in thinking about data and computing infrastructure?

I think the FAIR principles are at the core of what I know NASA is already doing. In the open science work I've seen many mentions of FAIR, and also of CARE with respect to indigenous communities. Right now data is often not findable, for example, and not interoperable; all the letters of FAIR are not there yet. If you cannot find data easily, if you cannot connect from whatever environment you're in because it's a closed environment, or because it's not an open standard, then it is not interoperable. So what I would say is: please, everything you do, make it open, make it based on open standards, and ensure that it is findable.
I would argue that discoverability of metadata goes a very long way.

Very good, thank you, Bruno, and thank you for that presentation.

Well, thank you — and I forgot to say that a colleague from the team is also here; I think she's on the participant list, and I just wanted to give her a shout-out for joining.

Wonderful, welcome. And with that, perhaps we can turn it over to Tom?

Sure, take it away. Let me do a few things. In the chat — hopefully this goes out to everyone — I just put a link: aka.ms/pc-nasa. That will take you to a JupyterHub I set up for this, and I'm going to share my screen. If you want, you're more than welcome to go ahead and go there. It's going to ask you for a username and a password; I've forgotten the password, but I think it is "NASA" — let me double check — yeah. So it will ask you for a username; put whatever you want there, and I'm pretty sure "NASA" will get you in. Yeah, that should do it. So use some unique username, and the password, again, is NASA, which I'll put in the chat as well once I find that screen — WebEx has rearranged my windows, sorry. Where did it go? Okay, here we go: chat. Password is NASA.
While you're doing that — thank you — I'm going to show just a couple of things as your stuff spins up. I should mention briefly: this is a JupyterHub that I deployed for this session, running on Azure Kubernetes Service, so it's like any other Kubernetes service out there. The idea is we're going to go through some data-analysis-type workloads, fetching some data from the Planetary Computer and making some pretty pictures with it.

A couple of things to mention. Bruno covered the components of the Planetary Computer; we have the data catalog, and we'll mainly be interacting with the STAC API, which is actually the same thing this HTML page is generated from, but we'll peek at the data catalog briefly to understand what data is available. Let me also mention a bit about the setup we have here while yours is spinning up. The main idea with our compute is that we don't really care how you do the compute, as long as it's on Azure — and that's less for a make-money reason than because it's flatly the most efficient way to get to the bytes. The bytes are all in a storage container — like an S3 bucket, but in Azure Blob Storage — all in a single data center, and if you want the fastest, most efficient access to the data, you're going to want to put your compute in that exact same data center. It happens to be the West Europe Azure region. So that's the setup we're connecting to: I'm here in my local browser, you all are in your local browsers on your own home networks or whatever network you're on, and running inside Azure, in the same region as the data, is this JupyterHub that we're connecting to. When we're doing compute — when we're downloading data from the blob storage containers — it has a nice high-bandwidth connection for those large datasets, and as we bring results back to our local client — a summary statistic, an image, a plot — that's going to be much, much smaller, so it's okay for that to go over the public internet.

We also have Dask here. Depending on time — I don't actually know exactly how much time we have, so I'll wait for you to interrupt me once I go too long — what we have Dask here for is scalable computing. It's just one of many, many ways to do scalable computing on Azure; I happen to be most familiar with it.
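For reference, on the production Planetary Computer Hub the usual way to get a Dask cluster running next to the data is through Dask Gateway; a minimal sketch is below. This temporary Hub may not have a gateway configured, so treat it as illustrative rather than something to run as-is.

```python
from dask_gateway import GatewayCluster

# Ask the hub's Dask Gateway for a cluster in the same region as the data.
cluster = GatewayCluster()
cluster.scale(4)                  # request four workers
client = cluster.get_client()     # register the cluster as the default scheduler
print(cluster.dashboard_link)     # live dashboard for watching the computation

# ... run your dask/xarray computation here ...

cluster.shutdown()                # release the workers when you're done
```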
Okay — and like Bruno mentioned, I can put this in the chat: we have a Hub. We're actually using a separate Hub today just because I didn't want all of you to have to deal with signing up for an account. If you are interested, we'll share the link at the end; you can sign up for an account and we can get you all approved. But for now I just set up a temporary one that you can all log into.

Okay, I think that's it for introductory stuff; hopefully everything's going okay. I actually don't know if you all can chat, but if you are having issues, somehow alert me that stuff is breaking and I'll try to fix it; I'll assume some people have successfully logged on. Just to mention about this setup: one of the nice things about the cloud is pay-as-you-go computing that scales down to zero, so it might take a few minutes — I had this completely scaled down, but as you request a pod, a notebook server, it will automatically start up virtual machines and then your notebook will come up.
Okay, we've got some time — great, thanks, Peter. Excellent. So you should be seeing — I apologize for the unrendered markdown — a page that looks like this. We're going to go through an example here, starting with this one about using the STAC API. If you all could click on the folder icon here: by default you're in the planetary-computer-examples repo, then we'll go to quickstarts, and then — I'm going to make this just a tiny bit smaller — you want "reading STAC". There's also a reading-stac R example; we're going to be using Python, just since that's the environment I have selected here, but STAC, as we'll see, is a cross-language standard.
Okay, hopefully things are working for people; I'll assume they are. Great — so, like Bruno mentioned, we have all this data in Azure Blob Storage, which is great. It's a lot of work getting that data there, with great help from our partners, but just having the files there in blob storage is not enough, we think; it's still too difficult to use that data. Just think about a simple request like "give me all the Landsat images over Washington for 2020": you have to be very familiar with USGS's particular naming scheme — how it does the WRS paths and rows, the various levels and processing modes and so on — to figure that out. So what we use instead, to avoid that pain, is STAC: the SpatioTemporal Asset Catalog, an open specification for cataloging spatiotemporal data.

Previewing a question I saw about whether the Planetary Computer is focused just on Earth data, or whether it can be used for other bodies like the Moon or Mars: people have hacked up STAC to make it work for other bodies. I don't know exactly how that works — I think they have to do some hacky things with coordinate reference systems — but in principle it could. Currently, though, all of our data is Earth-focused.
Okay, so we'll go ahead and go through this. For the Planetary Computer STAC API we're going to be using pystac-client. STAC is a whole standard for how to catalog spatiotemporal data, but by far the most useful thing it does is let you search that data, let you actually query it. In this case we're looking at an area around Microsoft's campus in Redmond, Washington — so we have that bounding box — and we're interested in scenes from 2020, and the query I was posing earlier becomes pretty straightforward to write. Here we're using Python code to do it, so we make that search and we get back the eight items that match our query for Landsat scenes over Redmond, Washington in December of that year.
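In outline, the search in that quickstart looks something like the sketch below. The bounding box is an approximate one around Redmond, the month is the December window from the narration, and the `modifier` argument (available in newer pystac-client versions) is an optional convenience that signs returned assets automatically:

```python
import planetary_computer
from pystac_client import Client

# Open the Planetary Computer STAC API; the modifier signs asset URLs for us.
catalog = Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
    modifier=planetary_computer.sign_inplace,
)

bbox = [-122.16, 47.62, -122.10, 47.68]   # roughly Microsoft's Redmond campus

search = catalog.search(
    collections=["landsat-c2-l2"],
    bbox=bbox,
    datetime="2020-12-01/2020-12-31",
)
items = list(search.items())
print(f"{len(items)} items matched")
```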

Yeah, if you do have any questions, throw them in the chat and I'll try to answer them as we go.

So we got those eight items matching, and you saw it returned really quickly, in less than a second or two. We haven't actually loaded any data; what we've done is use the STAC API to query the metadata for scenes that match our query. If we look at those items, they're GeoJSON features, so they have things like a geometry and all the other stuff you'd expect from a GeoJSON feature, and then they're extended with a whole bunch of other information: what platform it was captured on, what date-time or date range it was captured for, projection information — all the useful things you would need to work with this data — including things from the data provider, in this case USGS, like an estimate of cloudiness for each of these scenes. So you can very quickly filter out cloudy images and select the least cloudy one here.
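Since each item carries the `eo:cloud_cover` property, picking the least cloudy scene from the search results is a one-liner; a small sketch using the `items` list from the search above:

```python
# Sort by the eo:cloud_cover property and take the clearest scene.
least_cloudy = min(items, key=lambda item: item.properties["eo:cloud_cover"])
print(least_cloudy.id, least_cloudy.properties["eo:cloud_cover"])
```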
But STAC, I should mention — the idea is that it's all a metadata standard, all about linking out to the actual data. The items we got back have one or more assets each, and each asset is an individual file. In the case of Landsat Collection 2 Level-2, most of the interesting data assets are the cloud-optimized GeoTIFFs that we can access really efficiently; you also see things like the metadata files linked there as well. One of the assets we can use here is the rendered-preview asset. You can see it's a link, actually to a data API — this is the same thing powering the Explorer that Bruno used. The Explorer uses the STAC API to query — it looks at where your window is over Pakistan, gets the bounding box in latitude and longitude, and uses that to make the queries — and then the data API is responsible for returning the actual images that match your search.
So in this case, here's our least cloudy item over Redmond, Washington. We can also access the data, after we do one thing: we need to sign it. The actual assets themselves are in private storage containers, just to keep an eye on egress, but we do allow anonymous signing of tokens. You all didn't sign up for Planetary Computer accounts, and we didn't provide any kind of API key here; all we need to do is make a request to the Planetary Computer's SAS API, and that gives us a token we can use to read the actual data. So we'll sign the item — this makes an HTTP request in the background to the SAS API — and it gets us back a signed href: a URL with the typical stuff for Azure Blob Storage, the storage account, the container, the path to the cloud-optimized GeoTIFF, and then everything else is a read-only token. Now you can pass this URL off to anything that can read data over HTTP. In this case we're using rioxarray to read it into an xarray DataArray; we could also use things like QGIS, or in R, things built on top of GDAL — I think it's stars, or raster, I can't keep up with the R community — but anything that can speak HTTP can now access the data.
that can speak HTTP can now access the data   and the last thing uh I think last thing uh 
oh the scan through worth mentioning is that   um you can really efficiently uh make data cubes 
out of uh out of these stack items so the items   themselves have enough metadata that we can uh 
kind of um Mosaic them together uh in in space   and stack them through in time uh and we have 
all our bands here to very quickly create the   data Cube uh which I I think is just great so 
like we've gone from you know thinking about   low level details like what is the exact naming 
scheme that USGS uses for for these files and   like how can I lay them out in space and time and 
read them and understand their spatial extents we   don't have to worry about any of that instead we 
can use these higher level things like searching   by space and time and things like cloudiness and 
get back a bunch of Rich metadata that describes   the assets and based just on that metadata we 
can get these really nice convenient high level   libraries data structures like a data set that 
can we can work with to actually analyze the data um I'm going to skip through most of this just 
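One common way to build such a cube is the stackstac package, which reads the projection and footprint metadata from each signed item and assembles a lazy, Dask-backed array. A rough sketch — the asset keys and resolution are assumptions for illustration:

```python
import planetary_computer
import stackstac

# Sign every item so its asset URLs are readable, then stack them into a cube.
signed_items = [planetary_computer.sign(item) for item in items]

cube = stackstac.stack(
    signed_items,
    assets=["red", "green", "blue"],  # asset keys assumed for this collection
    resolution=100,                   # metres; deliberately coarse for a quick look
)
print(cube)  # dimensions: (time, band, y, x), backed by dask
```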
I'm going to skip through most of this, just for time, because I do want to get back to questions. You can search on additional fields like cloud cover — this varies dataset to dataset, and we'll see an example in the next notebook — and I'm going to skip through the rest of the STAC metadata material. I just briefly wanted to show that STAC isn't specific to cloud-optimized GeoTIFFs or raster data; STAC only cares about files, so it's all about linking to assets. In this case we're using Daymet data — daily North America Daymet — and if we look at the link here, it's a link to a file, well, actually a directory, in Azure Blob Storage that is a Zarr store, and we can go ahead and load that up very similarly to how we did the other example. So STAC is a very flexible metadata standard; for the most part it's really just focused on spatiotemporal data, and if you have some data with a spatial footprint and a temporal timestamp or range, STAC is a great way to catalog it.
pause there if there are any questions I can   answer now and then we'll jump on to a kind of 
more fun example that'll take 10 or 15 minutes   very good and there are some questions 
um folks I think have have been pretty   interested in some of the the 
details that you've been covering   um but let me take a step back um actually and and 
I think there's a question here that's a little   broad that you'd be able to speak to really 
nicely and that has to do with best practices   and are there particular best practices you might 
highlight for an open source science architecture   um data and Computing infrastructure for open 
source science or open source science operation   um yeah that's awesome uh let's see so I would 
say um my background is a in open source and   it's like you know woefully under maintained open 
source maintainers always yeah super stressed out   burned out so I would say um as much as possible 
and I understand people are busy uh but as much   as possible be involved with the open source 
uh uh libraries that you're building on um and   you know I mean yeah it can be hard to justify it 
especially in the short term uh spending time like   on open source not necessarily even uh you know 
working on features or whatever uh but just being   involved with the discussions there uh I think can 
be super valuable both for you because you like   understand where the projects are going uh but 
also for the community sharing your feedback as   a a user who's like trying you know applying this 
in practice at scale like NASA is doing um can be   super valuable and then the other thing is like 
um the hardest part about open sources you know   you can do anything but the most successful open 
source stories I've seen have always been around   uh people individuals who bring groups together 
and can coordinate um coordinate groups who might   have uh you know different uh priorities but I'll 
have some sort of shared um some overlap in those   priorities that they'd all benefit from coming 
together so uh yeah be involved and then as much   as possible be especially involved on the kind of 
coordination and uh coordination side of things   great um so kind of building on that let me shift 
Great. So, building on that, let me shift to existing approaches to open-source science and the data and computing infrastructure to support it. Are there particular approaches that you would suggest, or really want to flag, for the NASA work here?

Yeah, I think Pangeo is the go-to example: a group of people who were just trying to do geoscience on the cloud hit upon this idea of a JupyterHub deployment in the cloud, using Kubernetes, that scales with Dask. I think the initial thing was hacked up in a weekend by a few people at — I can't even remember which conference or workshop it was — but that idea has gone a long way toward the work they've been able to do, and it's what we do here with the Planetary Computer Hub that we provide; lots of people are also deploying their own hubs to customize the software environment or various things around that. So I think that is a good go-to example. The benefits are that you don't need to expose every individual to, say, the challenges around cloud subscriptions — how do I get the billing details right, which service should I use — you present them with a nice login and an easy way to scale their compute. That said, there are tons and tons of services that go beyond the core interactive computing environment that JupyterHub handles so well; that's its strength and its weakness. There are fewer options around job scheduling and batch-workflow-type things within JupyterHub itself, and how you complement those with other open-source or cloud technologies is kind of a different can of worms. So, Pangeo, I think, is a good place to start.
Great, thank you. Maybe one more question in a similar vein to the previous two. You've worked on open-source science, and reproducibility is a key aspect of open-source science. How is reproducibility of results — and, by extension, decisions — supported by the Microsoft Planetary Computer?

Yeah, there are a couple of answers. At a surface level we can say it's ideal: you can have a notebook, you can share a link to it — like Bruno showed, we have those examples where you can click a button, launch it in the Hub, and be off and running. And I don't want to undersell that: it's not nothing, it's a good accomplishment, a good first step. But it is just a first step. If you want to fully lock down the software environment, there are services in the background that are potentially being used that aren't necessarily encapsulated in that shareable link; there's the data — what happens if, say, USGS decides to reprocess some scenes? What happens to the data? Do we update it to follow USGS, do we make a new version, how do we do all that? So I think there are tons of questions around the trickier problems of reproducibility that I don't think we, or anyone else, have a good answer for. If you're interested in this, there's an interesting discussion and working group forming around it on the Pangeo Discourse — let me see — yep, I'm going to post this in the chat. I think it's a good summary of where things are at; this one is maybe a bit more focused on education, but I think that's a prime example of reproducibility.
Great, thank you. I know you have a couple more things you wanted to get to, but let me ask one more question here that is really quite fascinating — thank you to the folks who put it in. If we were to share NASA data and software on the Planetary Computer, how would this broaden accessibility to communities who ordinarily would not be able to use NASA data and tools?

Yeah, I think the biggest thing there — I guess there are a few things. First of all there's the bandwidth question: if you just have the data on some server, an FTP or HTTP server or whatever, even if it's publicly accessible, bandwidth can be a challenge, especially at scale. If you have data that is in the cloud, then there's at least the potential for anybody to use it, because there's the option of locating the compute with the data. That shifts the question a bit to how those people get access to compute, and with the Planetary Computer you just sign up for it, similar to lots of other services. So I think that's a good first step toward broadening that access. I think Bruno might have something as well — happy for you to chime in here.

I'm actually at my mom's place right now, and it's almost a dial-up connection. If I didn't have the Planetary Computer, I would need to go to one of the providers, download the file, open QGIS or ArcGIS, and it would have taken me, I don't know, three hours, if I could do it at all — versus the ten seconds it actually took. So I think people with slow bandwidth, or far from the cloud, can still access this, because most of the work happens in the cloud itself — not in a closed sense, but in a helpful way. The other side is that it then becomes Microsoft's own interest: now that we host the data, we have a strong interest in making sure people use it, and if you think of all the clients our company has, it's a tremendous platform for making sure the datasets we host get used, because our incentive is now for them to be used. There are a lot of datasets we could onboard, and the criterion we have so far is: are they useful? And because we don't really know how to measure whether they are useful, it becomes: are they used? So it is our metric of success to make sure these datasets are used, and our field teams — there are tons of people now trying to figure out, hey, who can use this data. So, a long way of saying that it becomes our incentive to make sure this data is used.
Yeah, thank you for jumping in on that question, Bruno; I think that additional perspective is really helpful, and I love the example of your mom's house right now. Those are really important comments, and we have a couple of folks in the chat who are agreeing with that. Tom, I know you had a couple of other things you wanted to cover, so why don't we turn it back to you for that?
Sure, and I'll try to be quick just to get to some more examples, or sorry, more questions. If you jump back up a level (so you were in Quickstarts), go back up a level and we'll go to Tutorials, and there's a fun, well, pretty, Hurricane Florence animation example. So that's under Tutorials, and then this Hurricane Florence animation; you can check out what we'll be making, but we'll actually do it live here. The idea behind this one is based off an example from pytroll, if you've used that library. It's loading some mesoscale data from GOES to visualize Hurricane Florence. So first of all you have to figure out where the storm was and when, and that's what this call is for, which, hopefully... this one's taking a while; hopefully we get the data set downloaded. We do not yet have this best track data set in the Planetary Computer, so we have to hit NOAA's servers for it, which maybe is a demonstration of why having all the data in the same place is a good idea. If this fails, I do have the latitude and longitude stored in another notebook, so I'm going to bring that up; if this fails entirely, at least I'll be able to do it and you all can copy the latitude and longitude, but I've got to set up another thing for that. I'm going to assume this failed and interrupt it. Maybe that's a bad idea; yeah, it's just downloading the data. OK, well, I'm gonna, oh, wait a sec while this comes up, and then we can avoid hitting those servers.
That's the other nice thing about Azure Blob Storage, really any blob storage service: they're built to scale, built to handle many concurrent requests, so you don't have to worry so much about a single user, or a few users, knocking the service over, like we appear to have done to NOAA. Oops. Give me one sec while this comes up.
I can actually show this over here: here's my other JupyterHub, the real one, where I have an example from NOAA's EDMW workshop, and I'm just going to copy and paste this over to this window. OK, so you all can skip this example where you download this stuff; skip cell two, I guess it was, and skip this one as well. Apologies for this, I should have planned ahead. We're going to skip all of that and jump straight to getting the imagery. Perfect. And you all will need, I'm going to post it in the chat here, we'll see how badly this gets formatted... seems to be OK, hopefully the quotes are all real quotes, if you want to follow along; otherwise I'll just go through it pretty quickly, just for time. Sorry about that.
But we have somehow magically discovered the bounding box and the date-time for where the storm was and when. So we're going to go ahead; again, now that we know where it's at, we're going to query the Planetary Computer STAC API. We don't have to know anything about how GOES organizes its data, its file names, or things like that; all we need to do is query the GOES Cloud Moisture Imagery collection for assets within this bounding box over this date range. And we're interested in just the mesoscale images: GOES is capturing CONUS and full-disk images at more or less the same time, and we only want the mesoscale imagery from when it was zoomed in on Hurricane Florence. I don't think I timed that, but you can see it's already finished, so a couple of seconds and we've got back these items that match our query. If we very quickly check and make sure that we're in the right spot (let me make this a bit smaller so we can see it), you can see that we're in the right spot.
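[For readers following along, a rough sketch of the kind of query being described: a spatiotemporal search against the GOES Cloud Moisture Imagery collection, filtered to mesoscale scenes. The collection id, the property used to select mesoscale imagery, and the bounding box and times below are assumptions standing in for the values in the notebook, so verify them against the catalog before relying on them.]

    # Sketch: find GOES-CMI mesoscale items around the storm with a STAC search.
    import planetary_computer
    import pystac_client

    catalog = pystac_client.Client.open(
        "https://planetarycomputer.microsoft.com/api/stac/v1",
        modifier=planetary_computer.sign_inplace,
    )

    bbox = [-72.0, 31.0, -69.0, 34.0]  # placeholder for the best-track bounding box
    search = catalog.search(
        collections=["goes-cmi"],                        # assumed collection id
        bbox=bbox,
        datetime="2018-09-11T13:00:00Z/2018-09-11T15:40:00Z",
        query={"goes:image-type": {"eq": "MESOSCALE"}},  # assumed property name
    )
    items = search.item_collection()
    print(len(items), "mesoscale items matched")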
OK, let's see: GOES does not have a green band, so we're going to do a bit of xarray work to make a synthetic green band out of the near-infrared, red, and blue bands. We'll do that here, and then a bit of work to make the picture look pretty; I don't know how scientifically accurate this is, but some kind of gamma correction, to get a time series of RGB arrays that we can then plot.
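[A sketch of the synthetic green step: the GOES ABI imager has blue, red, and a near-infrared 'veggie' band but no true green, so a pseudo-green band is blended from the other three before building RGB frames. The variable names and the blend weights below are a commonly used approximation, not necessarily the exact recipe in the tutorial.]

    # Sketch: build a pseudo-green band and a gamma-corrected RGB stack with xarray.
    import xarray as xr

    def true_color(ds: xr.Dataset) -> xr.DataArray:
        """Assumes reflectance variables C01 (blue), C02 (red), C03 (near-IR 'veggie')."""
        red, blue, nir = ds["C02"], ds["C01"], ds["C03"]
        green = 0.45 * red + 0.45 * blue + 0.10 * nir     # approximate pseudo-green blend
        rgb = xr.concat([red, green, blue], dim="band")   # stack into a band dimension
        return rgb.clip(0, 1) ** (1 / 2.2)                # simple gamma correction

    # Usage (hypothetical dataset with a time dimension):
    # frames = true_color(goes_ds)   # (band, time, y, x), one RGB frame per time step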
I'm going to very briefly show this: if you copy and paste this URL, the Dask dashboard URL, this is an example of computing on the data in parallel. We're computing in parallel on a single machine using, I think, four threads or processes with Dask, and the setup that we have here is a Dask Gateway, so you can easily scale out onto a cluster of machines.
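[A sketch of the two Dask setups mentioned: a handful of workers on the single machine running the notebook, versus a Dask Gateway cluster like the one the Hub provides. The worker counts are arbitrary, and the Gateway lines assume an environment where a gateway is already configured.]

    # Sketch: run the same computation on local workers or on a Dask Gateway cluster.
    from dask.distributed import Client, LocalCluster

    # Option 1: a few processes on this machine (roughly what the demo used).
    cluster = LocalCluster(n_workers=4, threads_per_worker=1)
    client = Client(cluster)
    print(client.dashboard_link)   # paste this into the dashboard URL box to watch progress

    # Option 2: scale out on a pre-configured Dask Gateway (as on the Hub).
    # from dask_gateway import Gateway
    # gateway = Gateway()
    # cluster = gateway.new_cluster()
    # cluster.scale(16)
    # client = cluster.get_client()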
So that's working through the computation here: reading data from Blob Storage, doing the linear combination to make that green band, doing the stacking, things like that. And then we have a bit of matplotlib, a lot of matplotlib, stuff here to make the animation and embed it in the notebook. I'll stop there, and we'll go back to the original animation up top and play that. This is actually a bit longer than what we were making there, but hopefully, well, it looks really amazing, I think; hopefully it's scientifically useful, and you all can tell me whether or not it is, but I think it's pretty cool. OK, again, sorry for the issues there with the NOAA servers; I'll have to send them an apology note afterwards and get that data set onboarded. With that, I think we'll jump back to questions if there are any.
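[A sketch of the matplotlib step described just above: turning a time series of RGB arrays into an animation embedded in the notebook. The frames argument is assumed to be the (time, y, x, 3) array produced by the earlier steps.]

    # Sketch: animate a stack of RGB frames and embed the result in a Jupyter notebook.
    import matplotlib.pyplot as plt
    from matplotlib import animation
    from IPython.display import HTML

    def animate_frames(frames, interval_ms=100):
        """frames: array-like of shape (time, y, x, 3) with values in [0, 1]."""
        fig, ax = plt.subplots()
        ax.set_axis_off()
        im = ax.imshow(frames[0])

        def update(i):
            im.set_data(frames[i])
            return (im,)

        anim = animation.FuncAnimation(fig, update, frames=len(frames), interval=interval_ms)
        plt.close(fig)                  # keep the static figure from rendering twice
        return HTML(anim.to_jshtml())   # embeds the animation in the notebook output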
Yes, there still are some, but thank you for that demonstration. It certainly brings up the speed you're capable of working at, and the 'aha, wow' moment at the end is quite helpful as well. Going back to that: you've got this amazing tool that you can work with, so how would you suggest going about designing a data and computing infrastructure that's capable of supporting the principles of open source science? NASA has mission needs around open source science: transparency, accessibility, inclusivity, reproducibility, which we talked about earlier. But this question is really about how you would go about designing data and computing infrastructure to support those mission needs.
Yeah, so there are definitely the lower-level things like co-locating data with compute and, wait, I shouldn't assume cloud, but efficient access to data, which are important. But I think even more important than that is really well-structured, standardized metadata. For the Planetary Computer we're using STAC; I know NASA's CMR also has some STAC support. I don't know the full details, but there are people at NASA who are familiar with STAC and involved with it, so that's great. Having that metadata makes the data actually searchable, queryable, discoverable by your users. That's been extremely important for us, and I think it's important for any collection of data sets. And then maybe even more important than that is the educational material, the educational side of things. With the Planetary Computer, and I think it's similar for NASA, we're in kind of an interesting spot where there's a tension: I have this tutorial for making this Hurricane Florence animation; would that be better suited for the matplotlib gallery, or for xarray's documentation? We have all of these pieces, these open source components that we're building on and bringing together for a specific use case, and it can be hard to know how to balance improving the documentation and examples for the libraries and components you're building off of versus building your own thing. That's a tension we're facing, and I think NASA would face it too: I'm guessing you all have a bunch of documentation that's specific to your computing and data analysis platforms, which may or may not be, actually I think a lot of it is, open source. So how do you balance that versus improving the documentation of the upstream libraries? And then there's absolutely a need for cross-cutting, high-level examples that use all of these things together, so where does that go? That's something we've been thinking about, and I think it's not completely solved yet.
Wonderful, and it's really helpful to know the kinds of things that you're running into and to hear you say, hey, we haven't solved them yet; that's always good for all of us to keep our eyes on. Are there particular advantages, and I apologize if this seems like a loaded question, but particular advantages that you see for the Planetary Computer over Google Earth Engine, for example?
Yeah, advantages and disadvantages, for sure. A lot of the advantages, and given the forum I think it's fair to say open source is of interest to me personally, to the Planetary Computer team, and presumably to you all, since it's in the title of this session: the Planetary Computer is built on open source components, and it is open source itself. All of our STAC API and metadata generation, all of that, and the Hub deployment, if you really want to look at that AKS deployment, is all open source. I think that's an important component, which gives you flexibility: this example is using Python and xarray, but if you want to use R and sits and all these other libraries for your analysis, then absolutely, go for it. And like I mentioned, there are plenty of disadvantages too. Google Earth Engine: I'm not going to throw any shade; it's a really amazing product and they do a ton of things really well. So, Bruno, I'm guessing you want to say something here too.
First of all, Google Earth Engine is a fantastic product that has helped advance tremendously what we can do with remote sensing for, what, ten years now, so no throwing any shade at them; it's a great product. As Tom said, there are some differences, or disadvantages, whichever way you want to call it. I like to think that if they had to build it today, they would probably build something very close to what the Planetary Computer is today. I don't know, maybe they have a different answer, but many of the things that we are using now did not exist back when they had to build it. Something we also like to highlight, which I think answers some of the questions, is that if there is something we are not doing that you want to do, you can do it, because it's open and it's modular: hey, there's this data set we don't have and you really need it; put it in the same data center, in your own tenant, ingest it with STAC, and it's going to be one hundred percent the same as if it was ingested by us. That also covers some questions I saw on the list about whether you can use it for Mars or the Moon. You will need to hack the specification a little bit; I saw a talk at the FOSS4G conference about people using the STAC spec for other planets. It's doable, and again, because it's open source, if we don't do it, you can do it. It's exactly the same, you know what's going on, there's no black box here.
Very good. Bruno, one of the questions here may be something that you want to speak to very directly: NASA has a Space Act Agreement with Microsoft. Can you comment on a possible way forward for collaboration between NASA and Microsoft on the Planetary Computer?
We already have some conversations going with some of your colleagues to figure out how to leverage that agreement into hosting, or into doing projects together on pilots. I would say we are already doing some of that, and if you have something specific in mind, we're happy to take it on and start another thread.
Wonderful, thank you. There's another question, again kind of the basics: how does Microsoft fund and sustain the activity?

That's the golden question, and I think it's also a golden question for NASA. It also gets into whether NASA wants to be in the business of disseminating all of these data products for everyone, and I think the answer is probably not. If there are commercial customers who want to depend on a data set, there's probably an opportunity for a company like ours, or other cloud providers, to say: we will host it and we will provide it for you. Of course we depend on you, because you are the providers, but we will then cover the elasticity and the one-to-many needs. That's a little bit of how we think of it. The PC Hub that we just talked about, we think of as a reference implementation; we think of it for academic use, for NGO use. But if you are a commercial company using the Planetary Computer, we would really encourage you to deploy your own Hub, your own Pangeo, and that means you will pay for that compute. There's no extra cost for using Pangeo, because it's open source, but you will generate consumption, so it becomes part of the offering, just like you can deploy Linux on a virtual machine in Azure. It's part of the business model of the cloud (the majority of the VMs actually run Linux), and it's a business model built on that. That's also why I made the comment before that it becomes our incentive to disseminate this data and make it useful, because if it's not, well, we are not in the business of archival for archival's sake; we can't be, right? We are in the business of figuring out how the resources we put in to pay for this are leveraged into more revenue for us, and I genuinely believe that is the case; otherwise solutions like this one, or Google Earth Engine, or others, wouldn't exist.
Very good. There are several folks who are interested in collaborating on specific topics; we've heard about looking at other planets, the Moon, Mars, etc. 'I'm interested in putting space weather data on the Planetary Computer,' one person says. 'Do you have any thoughts on how the capability would be useful?' So, again, looking up or out, as opposed to looking down.

We're going to need to change the name to, you know, the Cosmic Computer or something.
Yeah, the answer is that, in terms of the metadata, I think it's going to be a bit tricky to do that, but I don't think it's impossible. What happens is that the majority of developers are looking down, from a satellite, right? But I see no reason not to: when I was doing solar physics during my PhD, we also used some of the tooling that was meant for Earth to map the surface of the Sun. I would say let's do it, and if you have questions, put them on the discussions page; we'd love to see that. That's the power of being open: we haven't thought of that use case, so just go for it. And if you cannot use the STAC specification, that's fine, just use any other specification; that's also the beauty of being modular. If you put it in the same data center and then you want to use the PC Hub, do it. It is meant for these use cases, but we'd love to find out what crazy, hacky things people do with this.
Very good, so it sounds like it might be an interesting opportunity to explore. Very good. So we're getting close to the end of our time, and I want to make sure we close things out nicely here.

Three minutes left, please stop us. Oh my god. Let's pause for a second: have we answered the questions you asked us? If not, happy to try again.
I think, as I've been listening, you've done a really nice job of speaking to those in a very spot-on, thoughtful, and also succinct way, which is always a challenge; you guys have a lot going on, so I really appreciate it myself. And I would just offer an invitation to anyone, whether on the panel or in the audience: if there are some follow-up questions, we'd be happy... oh, I'm seeing, do we still have an hour?
Let me just ask openly: Hannah, am I wrong? I thought we were closing at the bottom of the hour. How much time do we have?

Yeah, today was a longer discussion, so we have until 4:30 today if we want it, for facilitated discussion.

My apologies, we have plenty of time. So anyone who was thinking of a follow-up question, now is a great opportunity, because we're not going to let Bruno and Tom go just yet. Tom, I think that was the compute on Dask there.

Yeah, I was just going to say that. But let's be candid and let's be open; we have plenty of time.
So I've got a question, please. This is Kevin. You know, we see a lot of, well, not a lot, like one: you have a pretty impressive system, right? And I think there are a lot of potential data sets to address these types of questions related to Earth science, or applications, or climate. So my question is, how do you knit that together? The whole conversation here is about how NASA can be internally better, but I think part of being internally better is making our systems work better with other systems, in an interoperable sort of way. I don't know if everybody is ever going to be able to put all their data in one spot and do all the analysis in one location. So as we work with ESA, with NSF, with NOAA, with USGS, all these people, with you, with Google: how do we make that a little bit better?
It's a really good question, Kevin. Maybe it's good to have the most-used data for the most people in one place, kind of like a CDN cache, but there's always going to be data that we don't have, and that's when that idea of a federated STAC, like a ring of STAC endpoints, could be helpful. You might be searching for something we don't have, but if we have the metadata, it already gives you the lead on where to go. Maybe the answer is not the bytes; maybe the answer is 'email Kevin,' or 'go to this other page,' not the data itself but a pointer to where you can get it. I dream that, for example, if NASA were to provide the data along with STAC metadata, even just static STAC files, it would probably make our lives much easier, right, Tom? Then we don't have to make the schema ourselves. And if you also have an API over the data, we might not even need to be connected to the data itself: if we get a query for something we don't find, we could forward it to your API, and maybe the answer is that you have it, it's not online with us, but you have it somewhere else. That kind of coordination among data providers and cloud companies is probably beneficial for everyone: we take the bulk of the demand, and you only get the requests that are specific to the more niche applications, or for the data sets that are harder for us to host.
I'm probably going to speak out of turn here, because I too do PowerPoints and not technical stuff anymore, but I do think the API reference for CMR has a STAC description in there, so you might want to take a look at that.

Yes, that is correct. It is a bit out of date, I don't know, a little out of date, not quite up to STAC 1.0, but I'm guessing people are working on it.
Yeah, it's great and it's fantastic to have. But I think the bigger question that I have is: OK, so we've got STAC catalogs and this type of thing and that type of thing, but how we coordinate that strategy across the organizations is an important point, right? We're talking to you, but who's talking with you and me and NOAA together? It's that coordination activity, to say, hey, look, maybe this is the way we should structure this ecosystem a little bit; not necessarily being prescriptive, but giving some options would be something helpful.
One thing we do is be opinionated; we don't shy away from being opinionated. We are Microsoft, you are NASA. If NASA is opinionated, it helps lean the weight in one particular direction. We decided that STAC was a good standard that the community was using, and now it's not only the community, it's also Microsoft putting its weight behind it, which probably helps that specification get even farther; if NASA then also embraces it, it goes in that direction. So I think if we are a little bit more opinionated, at the risk that some other data sets might be harder to put into STAC, it might help increase usability. But as I said, it's a trade-off; it's hard, and this is what becomes a little bit tricky: it's hard to cover everyone's needs. I know that cloud-optimized GeoTIFFs are great for some things and not good for others, or that GeoParquet is great for some things and not for others. So sometimes we have really good discussions, and sometimes uncomfortable discussions, about choosing a winner among these open standards, but I think it's still worth it.

Oh, and I was wrong about the CMR STAC API a moment ago: it's not out of date, it's been recently updated, apparently. So, fantastic.
Very good, thank you. We still have a couple of questions that have flowed into the chat here, one that's kind of a follow-up to one that came up, Tom, while you were talking. It's about NASA's Earth science data: to correctly use NASA's Earth science data for research, it's really important that researchers are familiar with the product documentation and that users are aware of the quality fields and values, as well as the product metadata, product version, that kind of thing. What approaches are you using to make this information easily findable and accessible for users, by users?
Yeah, super important. As we add these data sets we become pretty familiar with all of them, and I'm consistently amazed at how complicated and intricate each of these data sets is. Really, our only answer is to do tons and tons of linking back up to the upstream providers, both in whatever prose narrative we write in our example notebooks, and also in structured ways in STAC. STAC has a structured place to put the scientific citations back to the original papers, places to put the links and licensing and all of that. I think that's the minimum that's necessary, and that's what we're doing; and if you all have suggestions on how to better surface that critical information, I'm all ears to hear how to do that better.
The thing we are not doing yet, as far as I know (correct me if I'm wrong, Tom), is provenance: where the data exactly comes from. And I had this idea, I don't know if people will like it, to add a metadata tag that provides the MD5 hash of the file at the source, so you have a kind of chain of integrity: you have the MD5 hash of our file, but you also have the MD5 hash of the source.
Yep, we are planning to add that at some point; it's one of our work items, and I think it will help a lot, especially when the upstream providers have STAC metadata, because there's, again, a structured place to put that information about the files themselves, the MD5 hashes, all sorts of things about the files themselves, in our STAC catalog. We do this for Landsat 8: because USGS has STAC metadata, we have a 'via' link, or some way to indicate that this is the upstream provider's STAC item, so you follow that, and that STAC item has links to the assets on the USGS server, so you can track it all the way back. Again, there are issues around, well, what if the data changes, what if they update the data, and there are, again, STAC extensions for versioning. So it's infinite complexity, but I think there's a path forward.
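[A sketch of how that provenance idea can be expressed in STAC metadata today. The 'file:checksum' field comes from the STAC file extension, and the 'derived-from' and 'via' link relations are existing conventions, but the item, asset key, hash value, and upstream URL below are made up for illustration; note too that the file extension expects a multihash-encoded digest rather than a bare MD5 string.]

    # Sketch: attach an upstream checksum and a provenance link to a STAC item with pystac.
    import pystac

    item = pystac.Item.from_file("landsat-scene.json")    # hypothetical local STAC item

    # Record the source file's checksum on the asset (STAC "file" extension field).
    asset = item.assets["data"]                           # hypothetical asset key
    asset.extra_fields["file:checksum"] = "1220deadbeef"  # placeholder multihash digest

    # Point back at the upstream provider's STAC item.
    item.add_link(
        pystac.Link(
            rel="derived-from",
            target="https://example.usgs.gov/stac/items/upstream-item",  # placeholder URL
            media_type="application/json",
        )
    )
    item.save_object(dest_href="landsat-scene-with-provenance.json")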
Tom, I'm sorry for jumping in there, Peter, but I see one of the questions that everyone seems to ask and that I haven't been able to answer, and it's great to have you here live for it: kerchunk.

Oh yeah, actually let me jump over to that screen. Sorry, a bit of a curveball question; kerchunk, OK.
Yeah, so: archival data. There's so much of it out there, and I don't necessarily want to convert all of it to cloud-optimized formats, just because of the cost of doing that, and because there are so many existing processes built on top of those files in their current formats. So, for those not familiar with kerchunk (oh, and I heard about a similar project, DMR++ I think it is, an effort at NASA that's very similar to kerchunk): the idea is that you scan these files, let's not call them legacy files, but these not-cloud-optimized files, and figure out where the assets within them are. So for a single NetCDF file with many groups, many variables, that's chunked up: where does the temperature variable start, where does precipitation start, in the file, in the byte stream? This is so useful because the performance of a file system like Azure Blob Storage is very different from a local file system. With a local file system you can open up the files and seek all over, and it's not going to take that long; but with a remote file system like Azure Blob Storage it takes a long time to figure out where in the file these different pieces are.
So, jumping back to kerchunk and DMR++ (thank you for that): the idea behind these is to have a pre-processing step where you scan the data, scan each asset, and write out a sidecar file that records the location of each variable and each chunk within that NetCDF or GRIB file's byte stream. You end up with a JSON file that is basically a URL, an offset, and a length, and you combine that with the thing Bruno mentioned earlier about HTTP range requests. Once you have all of those, you can make range requests and fetch just that data. So you have all the metadata you need to build your data cubes, you know exactly where in these NetCDF files each chunk is, and then you can get cloud-optimized data access to these NetCDF or GRIB2 files, files that don't necessarily work well in the cloud. So that's the idea.
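[A sketch of the workflow just described: scan one NetCDF/HDF5 file, write the sidecar JSON of byte ranges, then open those references as if they were a Zarr store, so data reads become HTTP range requests against the original file. The URL is a placeholder, and the exact keyword arguments may differ across kerchunk, fsspec, and xarray versions.]

    # Sketch: build and use a kerchunk reference file for a remote NetCDF/HDF5 file.
    # Assumes: pip install kerchunk fsspec zarr xarray h5py aiohttp
    import json

    import fsspec
    import xarray as xr
    from kerchunk.hdf import SingleHdf5ToZarr

    url = "https://example.blob.core.windows.net/data/some_file.nc"  # placeholder

    # 1. Pre-processing: scan the file once and record variable/chunk byte ranges.
    with fsspec.open(url, "rb") as f:
        refs = SingleHdf5ToZarr(f, url).translate()
    with open("some_file.json", "w") as out:
        json.dump(refs, out)

    # 2. Later, open the references as a Zarr-like store; reads become range requests.
    ds = xr.open_dataset(
        "reference://",
        engine="zarr",
        backend_kwargs={
            "consolidated": False,
            "storage_options": {"fo": "some_file.json", "remote_protocol": "https"},
        },
    )
    print(ds)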
As for the reality: kerchunk is very new. It's a project that some folks from Anaconda and the Pangeo community are hacking on, and we think it's a very, very promising way forward. It needs a bit of work, and we're working on that; if you all want to become involved, then definitely do: examples, fixing bugs, things like that. But it seems like such a promising way to get this, again not legacy, this non-cloud-optimized data exposed in a cloud-optimized way.
I just want to share this quote from Paul Ramsey, who invented PostGIS. I don't want to mangle it, but it essentially says that there's all this focus on cloud-optimized formats, which is necessary, but the really important and really challenging thing is getting clients, cloud-optimized clients, to make efficient use of those well-organized bytes. And that's really what kerchunk takes to the extreme: what if the bytes aren't well organized, what if they're just as-is, as they were written twenty years ago or whatever, but we have a super sophisticated client that's able to do these cloud-optimized requests on the fly? That's the basic idea behind kerchunk and this cloud-optimized access pattern.
optimized access pattern and the status of the   plan they're completed is that we are looking 
closely at those Technologies we are playing   um last I remember last time that you were saying 
Tom that we're trying to figure out we have not   yet imported anything on karchunk or others 
but we are actively seeking feedback from the   community so we come there again as I was saying 
before the opinion I didn't say let's go with this   and I I believe that we could either compete or 
others say hey let's just use kirchang it would   favor the odds of that standard but we wanted 
to choose something that the community thinks   is the right one right yeah so we have a called 
experimental uh reference files for one one data   set then uh NASA's uh next gddp cmap 6 data set 
um where we made uh reference files for some of   those again software's young they're just like 
uh uh bugs and errors if we tried to do it for   every single one like projections into the future 
uh with like you know date times past some Val   invalid range so there's a lot of work to be done 
but uh it's I think it's really really promising great thank you um Tom we wanted to follow up 
on on some of your experience with with stack   um stack uh assets and x-ray Das Matt plot lib 
they're still rather low level user interface   compared to some of the object models of GE 
and open EO do you see a need for a higher   level interface something where more powerful 
abstractions that might service more low code   users yeah um maybe maybe this is uh something 
I struggle with it's like a blind spot of mine   because you know I'm I I like coding I think 
it's like a very powerful expressive way to do   this kind of analysis assuming you know how to you 
know code in python or whatever so I absolutely do   think yeah and I should say like there's the other 
end of it it's like the Explorer where there's   um uh essentially no code like you're you're 
manipulating the UI to generate the queries   and then it Returns the results um which is uh you 
know even for you know uh me like it's extremely   useful for very quickly visually debugging 
things and then it can kind of get you started   um on a path by you know showing you the 
code it used essentially behind the scenes so   um I yeah I guess I'm not quite sure I think I 
do think there's absolutely needs um within the   um I guess python Community for sure but 
um others other communities as well for   um better ways to work with this type of raster 
data um that uh well uh how do I say this like   uh yeah anyway I don't want to get too into the 
details but uh you know x-ray data data sets data   Rays they're very focused around the ideas of uh 
a regularly structured grid a kind of rectangular   rectilinear data Cube uh which is uh very nice for 
many data sets but it doesn't kind of accurately   capture like the path of the Sentinel landsat as 
it goes over the Earth like how do you represent   that is it more like a data frame or a fancy 
list or like a tree of data so anyway there's   lots of uh I I completely failed to answer your 
question because yes about low code I'm talking   about like other very complex code things so 
anyway that's kind of kind of my thoughts there   great well it sounds like it's uh it's a challenge 
on the horizon uh yes absolutely something that   may need may need some more attention um I am just 
I apologize I'm jumping back and forth I'm I'm   moving some some URLs over um making sure folks 
uh just we had a question come in about when uh   um when the the video might be available and 
what else might be available and just real   quickly we are stocking these on our project 
website I uh just provided the URL to that we   hope to have a video available within a few 
days it kind of depends on how quickly we   can turn that and then based on that video 
that that and the transcript is is helping   us put together a kind of a high level summary 
as well of some of the Q a and some of the the   um the key points that were made and that will 
be available a few days later again trying to   tie as much as of the conversation and discussion 
back to some of these main framing questions but   also really trying to pick up um the questions 
that have been coming in from the audience that   are kind of above and beyond those those framing 
questions um so with that in mind let me turn to a   couple of those that are above and beyond um we've 
got a question around support for non-python users   um how do you support that or or will 
you if you don't at the moment yeah   um so if you're uh there's the Explorer if you're 
not familiar with coding which is like a great   way to visually inspect and understand uh some 
of the data sets and hopefully non-raster data   sets in the future there is uh so all of our 
catalogs of as far as like uh discovery of   what data sets possible data sets are possible 
are cataloged stack is an open specification and   there are clients in lots and lots of different 
libraries so you can equally well use a stack   from any language really any language that can 
do HTTP and then we've worked with the developers   of our stack from the Brazil uh I'm gonna it's 
like inp their their space agency um to to make   sure that uh our stack their client library 
for R works well with the planetary computer   um and we uh so that's like the the stack side 
of things and then going up an additional level   of of like for the Hub specifically for compute 
we do have um r r the programming language are   um profiles as well so you can start up our 
kernels um and use all those libraries that   that uh geospatial analysis tool chain um we also 
I guess maybe somewhat better answering your last   question about lower code we also have a qgis 
profile where if you want you can go into that   um it starts a cujit server in Azure your browser 
you're still accessing it from your browser   locally but like the cutest compute happens in 
Azure close to the data so if you want to that   graphical user interface to the data we also have 
a qgis profile you can start up so those are kind   of the non-python-centric ones that that users 
have today and then again you know open source   is if you can work with the stack metadata 
you can use whatever tool chain you want   I'll I'll also add that um again that is very 
modular there is we have customers commercial   customers that just use the files and they don't 
touch anything else but that's fine we have the   customers that use the metadata API through the 
computer Hub that's great in the python kernel   or R kernel or the these qgis on the server 
but if you have a Q yes you can also connect   to that totally fine either way did we also have 
commercial customers who are then using the HTTP   request in their own virtual machines doing 
whatever they they want to use that's kind   of the beauty of being um modular in that sense 
and we have the PC Explorer which not only is   great for people who do not want to code or 
don't know how to code it is fantastic also   for people who do code to quickly grabs that 
snippet of code that I showed that shows the   the lines of code for the region and you can 
change it from python from other things but   the idea is again to provide a bridge between the 
no code the low code to the actual doing the work   I don't think this is a conversation I constantly 
have with with a colleague Matt which is the one   developing the PC Explorer this is such a danger 
for the PC Explorer to scope creep because there's   so many things you could do and you I think you 
have to put a an uh limit on those otherwise   are you trying to make a qeis or rpis in the 
browser now it's going to be needs always to   then do something in whatever two videos which 
is one of the questions Katie just passed that   um on on the whatever tooling they they use we 
do have a I would say to your question Katie   of the academy users roughly one third of our 
users are Academia one-third are commercial and   ones that are mixed so it's not we're not 
a research tool if anything we may be more   our Enterprise and that's also what Microsoft 
likes an Enterprise level platform that is also   fantastic for academic and research use by Design 
again so that to minimize research to operations thank you yeah it looks like Katie's that 
your your response is really resonating with   Katie I would imagine others as well um kind of a 
detailed question I mean a very specific question   um but probably important here um does the 
planetary computer handle elevation altitude   data as well uh yep so it's with uh mentioning 
though like um you know where we're uh using   stack for our uh cataloging um so we don't it 
doesn't really matter what the data within it is   um it certainly works well for raster data but 
it works very well for essentially any any type   of spatial temporal data um and so when if you 
have data stored in tsar and that CDF that has   multiple levels that's totally totally doable 
and you would access that you know you would   search it normally through the stack API 
you would access it again normally through   you know x-ray or whatever in-dimensional array 
Library you're using to work with that data and   then kind of the thing that you know maybe is 
less figured out is like how do you visualize   that in something like the Explorer which 
is again currently mostly focused around   um kind of a single spectral band for a single 
um uh altitude but you could easily imagine uh   ways to like have a slider to adjust the adjust 
the elevation or the altitude uh based on you   know your selection there so yep it works well 
yeah it works it's totally doable it's just   like uh depending on the exact nature of the 
data set uh other some things might not work very good um similar uh you 
know NASA has some data sets   um specifically um they've they've got uh open 
population raster data cdac um is what's the   best way to get data added to the archive so it's 
Yeah, let's see, two answers. First of all, we recognize that right now the Planetary Computer team is responsible for maintaining relationships with all the data providers, for (depending on the data set) potentially doing the cloud-optimization conversion, potentially doing the STAC metadata creation, and then the ingestion into our database. So there are a lot of things that we become the bottleneck for. We're hoping to improve a lot of the tooling around creating STAC metadata and getting it ingested, and all the stuff around the metadata side and the sharing side of things, to make it easier for any group to share their data through STAC on Azure Blob Storage. That is something we're working on. For now, the process is to reach out to us; I'll put a link in the chat in a second. We have a data set request page that you can fill out, and we can take it from there. All of these data sets are unique, and they take a good amount of effort to get cataloged correctly so that they're usable by as many people as possible.
Just to add to that: the best way to get your data set on board, because again it's just blocked by us and this is a very long tail of data sets, is to help us by making sure the data is in a format that is as cloud-optimized as possible, if that's possible, and that the metadata fields are clear, ideally with STAC GeoJSON or schema fields, things like that; that makes our life easier. And then, most importantly, who would use this, and for what? If it's a data set that is interesting, that's great; but if it's a data set where you have identified who would be using it and for what, that really helps us prioritize it. If it's on the order of petabytes, then the conversation might be a little bit harder; if it's less than that, the volume is not really that much of a problem. Also, whether it updates periodically or is a one-off: some data sets we have are updated every year, or just once; some update every few minutes. So those are some of the criteria we have for prioritizing. As I said, it's not that there's a secret channel or something; it really is a matter of prioritizing until we develop this easier, more self-service ingestion.
Very good, thank you. Let me see if there are any particular questions that might be coming from panelists who would like to come online and ask a question. We've worked through all the questions that are in the queue; thank you very much, I mean, we worked through a long set of questions, and I really appreciate the discussion. We do have time; I just wanted to see if there's anyone among the panelists who might want to come on.
I will try to reciprocate and ask everyone: if you can, please share with us directly, on that email, any feedback you have, what you like the most, what you like the least; anything really helps. We're building the Planetary Computer, and we are building it very openly, because we want to be as useful to you as possible, to make a change and to make the world a better place. That is literally what we're trying to do, and we have an amazing opportunity to shape it, so please be candid and reach out with feedback.
Very good, that's a wonderful offer. And I know, you can see it in the chat, folks really appreciate the time you've spent with us today, and really appreciate the two complementary perspectives that you brought to the discussion and to the conversation: the high-level, big-picture look at the whole platform, and also the technical details of the how and the why and the what. Just a fascinating and very thoughtful combination. Thank you to you both for bringing those perspectives and working them back and forth so elegantly; it was very nice. I'm not seeing any additional questions, so, Hannah, can you bring us to the closing couple of slides? Sorry to put her right on the spot.
Thank you, Hannah, I see it happening. Well, building on this series that we have, we have some upcoming sessions scheduled. We're looking at the San Diego Supercomputer Center towards the end of September, and NVIDIA will be here early in October. Sandia's Center for Computing Research is a new one on our schedule, coming September 22nd; the Pittsburgh Supercomputing Center is September 23rd; and then Esri is coming in at the end of the month in October, on October 21st. We also have the Texas Advanced Computing Center coming in on the 5th of October. So we have a number of sessions in the next week or two, and then we'll go to Esri towards the end of next month. All of these are building on this series of questions that we've been asking, questions that Tom and Bruno have been so kind to really wrestle with, with us and with each other: how to think through some questions that are particularly challenging and timely for the SMD data and computing infrastructure project and study, and for the whole move towards open source science, learning lessons from folks who've been doing this for quite a while, both inside NASA and elsewhere, but also thinking down the road: what should we be anticipating, what's coming at us? So thank you, Bruno and Tom, for helping to be a part of this study and this really important conversation.
With that, yeah, we have a couple of nice thank-yous coming in in the chat. I think we're all set. Elena, would you like to come in?

Yes. Well, I just want to say thank you so much for speaking with us today. We really enjoyed your talk and all of your insights, so thank you very much. Thanks.
All right, with that, thank you everyone for your time on this Friday. We came close to 100 attendees; we topped out just shy of 90, but that's not bad for a Friday, and we covered so much of the waterfront, coast to coast, literally. So thank you everyone, have a great weekend, and we hope to see you at a future session.
