CSSE 533 – Database Systems
Week 1, Day 1
Steve Chenoweth
CSSE Dept

We’ll look at…
- Your goals for the course
- Course syllabus
- Course schedule
- Books
- Data mining material
- What we’ll do at meetings
- What environment we’ll use (Mike’s)
- Initial goals for PostgreSQL
- What we’ll do at our second meeting

Your goals for the course
Suggestions (see also next page):
- Learning about the two database systems
  - PostgreSQL
  - MongoDB
  - How these relate to your work
- Learning about data mining
- How these topics all relate to other projects you are doing / want to do
- Project scope you’d consider doing

What you said (Jan 20 emails)
Jon: The areas that interest me with respect to Database Structures are as follows:
- Database Design as it relates to Big Data and Machine Learning
  - Are these relational or flat mutually exclusive tables?
  - What are the suggested high performing designs with respect to the domain of machine learning and big data?
- PL / SQL | Stored Procedures
  - I did some PL/SQL in college but would like to see what the latest trend is with regard to stored procedures and the like for this space.
That’s all I can think of at the moment but didn’t want it to escape my head.
Mike: I’m on board with learning more about stored procedures. I’m very familiar with a lot of the details surrounding them, but I really only wrote them myself way back as an undergrad. So that sounds like a fun itch to scratch. I suppose that, at the moment, I don’t have anything in particular that I’d like to learn about just yet. I’ll speak up if anything comes to me.

A few quick questions
- What is Interactive Intelligence’s current use of both these databases?
- How are they (you) doing data mining now?
- Can you latch onto whatever the internal user groups are for these?
- Can you get them interested in whatever project you would like to do?

Course syllabus
- Main goal – make it a chance to learn more about database systems you care about
  - And learn about using them for data mining
- Left open – what the assignments would be, and when due
  - Happy to take suggestions
- Let’s look it over

Course schedule
I divided it into:
- Week 1 – figure out what we want to do
- PostgreSQL – four weeks
- MongoDB – four weeks
- Data mining embedded in each of these
- Week 10 – Project demos & lessons learned

Books
- Both are handy dandy O’Reilly’s
- Both are useful for practitioners who have a larger role, like needing to do some systems admin work.
- Both assume you already know basic stuff like SQL.
- We could either:
  - Have reading assignments? Or,
  - Just use them as references?

Data mining material
- There’s lots all over the Internet.
- Basically, it’s a combination of:
  - The Machine Learning stuff we did last term, and
  - Using database systems like these two, to get the data, and
  - If you have really huge data, using tools like Hadoop to run algorithms on many machines at once.

What we’ll do at meetings
- Talk about the DBMS of interest.
- Talk about how it can do data mining.
- See demos of things you want to show.
- Discuss issues and progress on projects.
- What we’ll do each week – see next slide

What we’ll do at meetings, cont’d
What we’ll do each week: for each DBMS, I divided the 4 weeks into these parts:
- Learning to use it
- Working on it with real data
  - See if you could get data you like to be formatted for something you want to do.
- Applying it to data mining
  - Try things and ask questions and make plans
  - Try applying some program or query to it.
- Applying it to your project
  - Show how you could actually get something of value.

What environment we’ll use (Mike’s)
- Maybe tonight and/or Thursday are an intro for the PostgreSQL installation?
- Plus see what happens when we all try to use it at the same time!?
- Other considerations?

Initial goals for PostgreSQL
- Your goals for using this DB
- What’s special about it, as you see this
  - (We’ll talk next week about what’s special about it as the world sees it.)
- Thursday – How it might fit with your project?
- Other learning goals for it?

What we’ll do at our second meeting
- More details about your likely project.
  - Where it stands – e.g., Do you have data ready to try loading into some schema on either DB?
- What in particular might be doable as an exercise in using PostgreSQL?
- More initial demos?