Visit the Open Source Cyber Intelligence - Introduction Training course recordings page
WEBVTT

Testing, testing, one, two. Can you all hear me? If you can hear me, give me some form of indication so I can begin. I'm going to give it about two more minutes. If there's any way you can give me an indication that you can hear me, even just clicking on your home folder, that lets me know that we have a connection.

Hi, how are you doing today? I can't complain, just trying to get a few things in order at the last second, of course. Are you able to see my desktop and the PowerPoint presentation? Yes, sir. Yes, ma'am. Awesome. Are you able to see the slideshow? Did you go to the training room? Yes, ma'am. Okay, so are you ready to get started?

Welcome to our intensive training on open source intelligence, which is a strategic capability that, when executed correctly, can shift how you see incident response: from reaction to readiness, and from speculation to proof. In this course we're going to talk about forensic-grade, human-aligned security in regards to the public scene, the things that can be found through Google, YouTube, and anything else that's front-facing and readily available to anyone.

Before we begin, do you mind giving me a little bit of information about your background? Are you in cybersecurity, or new to cybersecurity? Do you have a background in cybersecurity? Oh, okay. So may I ask what your background is in? Yeah. Okay. So let's say, for example, open source intelligence comes into play with Vietnamese, with being Vietnamese or having majored in Vietnamese. Let's say you're in America and you want to know about certain things going on back at home, right?
Open source intelligence is what allows you to keep up with those things. It gives you access to information and networks back at home. It allows you to stay attached to family. It allows you to provide security. It also allows you to keep up to date with current events and with how information travels.

So we're going to begin by establishing foundational definitions and reviewing core use cases. We're going to walk through what OSINT is, why OSINT is necessary, and real-world case scenarios where a person has run into an issue and we used open sources to figure out the solution.

My name is Junius Whitaker. I own a company by the name of Intelligent Securities, which works with NobleProg on providing trainings and on helping people walk in the door of understanding what cybersecurity is and how it's performed. At Intelligent Securities we work with what's considered a psychology-driven approach. The way a criminal or a hacker who's trying to gain access to pornographic photos is psychologically going to travel and navigate is completely different from the person who's looking for, say, bank credentials. Those forms of open source intelligence look completely different based on the type of data you're looking for.

So open source intelligence, which is OSINT, refers to the collection and analysis of publicly available information to produce actionable intelligence. One of the things in cybersecurity that we're required to have before even beginning is a network: you have to have something that the information you're trying to investigate was traveling on. You have to have a device, and you have to have an individual that's attached
to the device, right? Sometimes we might have an incident where we don't have all three of those things, but open source intelligence helps us get them. Let's say you have a device and you have a network ID, but you don't have an individual. Open source intelligence, via Facebook, will let you know that the person using that iPhone 14, and the indicators attached to that phone, are the same ones that were successful in that breach, or in that thing that wasn't allowed to be done. Does that make sense?

Okay, so they're on her desktop, and I believe that if she goes into the materials it should be there already. The PowerPoint should be there, and there should be a student manual there as well. Sounds good. Awesome, thank you. So we're at the course overview.

We're going to go through the introduction to OSINT, OSINT methodologies and tools, and key ethical guidelines for conducting OSINT. We're also going to do hands-on applications using Kali Linux, which is why you have the operating system that's in front of you, and we're going to go through real-world case studies. We're going to actually apply some of these things at the end of the course so you can see how it works.

Do you have any experience with Linux? Okay. Well, it's not that much of a beast. We're going to get a little bit of a tutorial as we go through these things, and what you'll find is that the operating system you're in right now, Kali Linux, is usually our go-to in the cybersecurity field. See that dragon at the top left corner of your screen? If you click on it... yeah, see the tools there? Each one of those stands for a different process or function in cybersecurity, and we use this operating system to essentially assess, eradicate, or
sanitize things going on with your system or your network. So that's why we are here.

So let's start with what open source intelligence is. Open source intelligence refers to the intelligence gathered from publicly available sources: websites, social media platforms, public records, news outlets, forums, and databases. The purpose of this information is to support decision making, security investigations, and threat analysis.

Let's say, for example, in the last 10 years I've had a situation where a family reached out to me because someone had sold their mother's home. The mother had died, and it wasn't until they sat down to settle the will that they realized the home had been removed from it. In that situation we ended up going to public records, to what's considered probate, to find out what had occurred and why the mother's house was sold. Rather than them going into a crazy downhill spiral, we were able to find information readily available to them so that they could understand how to address the situation. Does that make sense? My apologies, I hope that my asking "does it make sense" for clarity doesn't offend you. There's no intention to insult.

In the United States intelligence community and NATO, we classify OSINT as one of the core principles of the intelligence life cycle. A distinguishing feature is that it relies entirely on legal, publicly accessible sources. So again we're talking about websites, forums, social media platforms, government records, and exposed infrastructure metadata.

Have you watched a miniseries on Netflix named Don't F with Cats? Uh-huh, yes. If you get a chance, it's something you could go back and reference, and the entire series is about open source intelligence. So
the premise is that there's a Facebook forum, or Facebook group, in 2008, about people who love cats, but someone comes into the forum and makes a post that's dangerous, a person doing things to cats. Using open source intelligence, they scoured the internet. They found the doorknob in the room, they found the place the doorknob was sold, they found the computer in the room, they found the serial number for the computer. They literally used all the public-facing information available to them to track down the perpetrator that was hurting... And so, if we can... Yeah. Yep. Uh-huh. Yep.

In the context of cyber investigation, OSINT is not merely about finding data; it's about establishing truth through digital traces. It is forensic in nature, strategic in application, and often the first and most critical phase in incident attribution, digital recovery, risk recovery, and breach remediation. That comes before we can even get to what we consider methodologies, which is how we apply things, how we engage, how we as cybersecurity professionals come to an outcome and then provide that outcome to someone else, so that they're able to come to the same outcome by following the same steps that we took.

So we're going through how OSINT is used: security investigations, corporate research. On corporate research, there are a lot of times when you have a merger or acquisition, and the brand you already service can be affected by things detrimental in the brands you're buying. So you will hire a person like me to come and do an assessment, a risk assessment more so, to address any concerns that might be ahead in bringing this asset onto whatever it is that you've already acquired. When you're checking competitive footprints, you're identifying domain spoofing and you're mapping supply chain vulnerabilities. UPS has tens of thousands of miles
day they're driving based on the logistics and the open source intelligence provided to their --> drivers right so even when you think about your google maps when you think about ways things --> like that they're also contributing to the space of open source intelligence so when it comes to --> legal and compliance or law enforcement you can use it for tracking criminals and activity and --> locating individuals you can also use it for verifying geospatial evidence and as well as --> like marketing extremist activities online right so sometimes you don't want to infiltrate sometimes --> you don't want to engage you just want to make an assessment right because sometimes you have --> assessments that need to be made so that there are resources available if time should need to --> exist or you need to uh advance in certain ways of engaging with certain circumstances --> um threat intelligence is like where we would identify and analyze like potential --> that's to an organization so it could be not only a client that we have but it can be a competitor --> of our client depending on how severe the information is right so that also would threat --> intelligence could fall into like investigative journalism and human rights research right so --> open source intelligence in these spaces consistent like validating source credibility investigating --> shell companies and just forming geolocation on visual contents right so that's where we start to --> transition into like you found a photo getting the geo and from getting the data the metadata from --> the photo finding the name find a location beginning the investigation right so each of --> these sections each one of these sectors osense utility is multiplied when combined with --> investigative rigor technical precision and ethical risk joint while many platforms --> can be used to conduct osent like cali is preferred just based on the operational --> environment and kind of like i was telling you before in regards to 
forensic and adverse, excuse me, adversarial investigations. It was created by a company named Offensive Security, and they provide over 200 tools specifically designed for penetration testing, red teaming, and digital reconnaissance, and also for creating a complete threat map and understanding of what occurs in the system in a given situation. So the reason why we went into depth... I'm sorry. Okay.

We use Kali as our base for ethical reasons, because if we all have the base foundation of Kali, then we can all apply the same things and we can all get the same outcomes. There's never a point in time when one cybersecurity professional is put against another, because all we're dealing with is factual information.

Another thing to think about is that Kali has one of the largest maintained tool libraries when it comes to cybersecurity. It maintains consistency in investigation workflows, ensures compatibility with forensic scripting and automation, and isolates investigative activity within a secure virtualized instance. You always want to make sure that your environment is sterilized away from other people, and Kali helps us do that.

In today's course, talking about Kali, we're going to be using Shodan, which is a search engine that indexes internet-connected devices. It enables... I'm sorry. Okay, I'm sorry. The reason why I'm doing this is because for some odd reason it sounds like I'm having feedback in my ear, which is fine if you don't hear it on your end.

So Shodan is a search engine that indexes internet-connected devices. What Shodan does is it takes your IP address, it takes your MAC address, and it stores them in a library, and whenever people like us start looking for that data, it generally presents it to us on a silver platter. So it enables visibility into exposed
services and surfaces, right? So unsecured webcams, industrial control systems, and sometimes default-configured services.

One of the things that I tell all of my clients is that when you first buy a device, the worst thing you can do is just install your personal information on it. Let's say, for example, you own an iPhone. There are 700 million of those iPhones, and each one of them is a carbon copy of the last one. Until you go in and manually configure your phone to have the protections that are necessary, you basically all have the same device, and it's only 16 to 32 characters differentiating them. If a person has enough time to attack a device, then it's inevitable that eventually that password will be broken. Because of that, we always recommend that people stay away from default-configured services or default-configured devices. Does that make sense?

From there we'll go to Maltego, which is a graphical link analysis and relationship mapping tool. You use that to correlate your metadata across domains, social accounts, infrastructure, and identities. It creates a map of how this thing attaches to all of these spaces in the rest of the internet.

From there we go to WHOIS lookup, which takes the information we find while using Maltego and Shodan and provides us ownership and registration metadata for domains and IPs. Often that's the first step in mapping the threat actor's infrastructure. Once we have an IP or domain, we would document that and let that be known as our starting point. This is why open source intelligence is always our go-to and our first stop in an investigation.

Usually, after we find data on WHOIS lookup, the next move is Google dorks, which are advanced search operators that reveal exposed files,
configurations, and server artifacts indexed by public search engines. We think of websites in terms of entering the link, but in actuality there are dozens of other actions occurring in the background, on the back end. One of those things is the logging and event allocation that's stored in case, later on down the road, there's an instance where it's broken and you need to go back and pick something out.

In the process of using Google dorks, you're probably going to end up coming out of that situation with the DNS information for the investigation, and that's when you use DNSdumpster. DNSdumpster is a reconnaissance platform that maps subdomains, mail servers, and DNS records, all tied to the domain of interest.

Each one of these tools represents a discrete intelligence discipline, from infrastructure mapping to digital relationship analysis, and collectively they form the foundation of what's structured as OSINT methodology. In this methodology, you're using the public data provided to you to get your IP address, which will be your network identification. You're using that network identification to find which device was used. Do you ever notice how sometimes, when you look at your metadata, it shows that you were in your Safari browser, using iOS 18, using this IP address? Well, that metadata then turns into where we begin looking for personal information, which will also correlate with other information that we've found. That's how we end up with our methodology for OSINT.

So, back to ethical guidelines for OSINT. The legality of OSINT is that it must adhere to your local laws and regulations. If you're governed by SOC 2, GDPR, CCPA, whichever regulation or regulatory board services your industry, whatever their process is for digital
security is what you must follow, and the same for network security.

Privacy means that you do not access private or restricted information. A lot of times we come into circumstances where people tend to blur the line of what's right and what's acceptable. It's one thing to gain access to information based on what you found publicly. It's something completely different to find information that requires infiltration of a network or a system that you aren't given permission to access.

Transparency means that your data can only be from publicly accessible sources, and that you can't use social engineering. Social engineering is when you use social media platforms, and information provided on social media, to construct an identity, a relationship, or an avenue of fraud to be able to gain access to a person's personal information.

Your legal considerations in OSINT include GDPR compliance: understanding how personal data is regulated in the EU, but also across the world. There are multiple different compliance systems that exist, but all of them generally tend to fall under the guidelines of GDPR. Then the terms of service: regardless, you have to respect the terms of service of a platform and its website. Some websites, although they might make this information public, have strict rules and restrictions as to how you can use it. And then you have your data protection laws, which require that you comply with the laws that govern the collection, storage, and sharing of personal data.

So, yeah, we talked about Kali. Now we're back, starting at the OSINT methodologies, where we just left off. Based on the criteria and the request of your client, your open source intelligence is always going to start first with your information gathering. You always want to be able to have a
strong starting point in regards to what you want your research domains to be, what your IP addresses are, your social media platforms, and more.

Let's say, for example, you have a client, and for whatever reason they are randomly getting messages online from a disgruntled employee disguised as a dissatisfied customer. You would have to have that client provide you with the domains that information came from, the IP address that the domain came from, and the profiles, and that's going to be the beginning of where your research starts. Are you with me?

All right. From there, you're going to correlate and organize that data from multiple sources. It's not just going to be Facebook. You're going to pull the pictures from Facebook, and then that leads to a completely different source: you're going to be able to scrape the internet with the picture from Facebook and see if it exists anywhere else in the world. Or better, you're going to take the metadata from that picture and research with the geolocation to see where that instance took place. You want to be able to see whether that picture existed in the format in which it was taken, or whether it was uploaded in that time frame. That information is going to go into your report, and it's going to be provided in a clear and actionable manner that allows your client to know what their next steps are.

As we've already talked about, our tools in OSINT are: Google Dorking, which is used for advanced search queries to gather information from Google; Shodan, which is used for discovering vulnerable internet-connected devices; Maltego, for mapping relationships and data visualization; and Recon-ng, as an automated recon
framework. Recon is just another word for information gathering in cybersecurity; it's just the first phase. WHOIS is for querying domain registration data. If you have a website, or anything that's publicly housed on the internet, it's going to have a registration with WHOIS. Am I going too fast for you? Okay, awesome.

So, case study: investigating a suspicious domain. We're going to use open source right now, in a short time frame, just creating a scenario to show you how open source intelligence will work. The tools that will be available to you are WHOIS, Nmap, DNSdumpster, and Maltego. You'll notice that I added a tool that wasn't communicated before, because I want to be able to work with you on building your skills from the beginning in how you search for data.

In this scenario we have a fraudulent website that is suspected of phishing. Do you know what phishing is? Okay. So the tools you get to use are WHOIS, Nmap, DNSdumpster, and Maltego.

Step one is to use WHOIS to gather the domain registration information. From there you would acquire the person's name, you would acquire their domain, and you would acquire all the information provided by their host to give them an identity.

The next thing you would do is use Nmap... I mean, I'm sorry, using DNSdumpster, the next thing you would do is discover the domain's subdomains. The domain is what's going to be your connection, and your subdomains are what's going to hold all of your directories and folders.

From there, once you have the subdomains and you understand how their network is essentially laid out, IP-wise or domain-wise, the next thing you want to do is scan each one of those IPs for open ports using Nmap. Those ports are used to communicate back and forth with
various devices over the network and over the internet, to get a series of outcomes presented on your screen or in the background. From there, based on those open ports, you would use Maltego to visualize the domain's relationships.

In real-world applications, OSINT is used by security teams to identify external threats and vulnerabilities. It's used by investigative journalists for exposing fraudulent organizations. It's used by us in discovering compromised systems and exposed devices. And it's used by governments for national security and defense applications.

As we continue to go through, you'll use Kali Linux tools to conduct OSINT research on the sample target. The tools you're going to use, again, are Google dorks, WHOIS, Shodan, and DNSdumpster. I keep emphasizing these because they are always going to be your fail-safe tools. I've seen in the past that if I don't reiterate these things maybe five or six times in the beginning, the end result is someone not understanding the full capacity of OSINT. So I hope you can bear with me on that one.

The key takeaways are that OSINT is a powerful tool for gathering intelligence from publicly available sources; Kali Linux provides a rich set of tools for OSINT investigations; you should always follow ethical and legal guidelines to ensure responsible use of OSINT; and hands-on practice is crucial for mastering OSINT technologies and techniques.

One second. Right, so a lot of the things we're going to do at this point are going to be on the fly. Are you ready for the hands-on aspect? All right. Before we start, can you share a real-world example, from your professional experience, or something you saw on the news, or something you went through in education, where OSINT might have played a critical role? Can you hear me?
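As a rough sketch of the case-study steps walked through earlier (WHOIS, then DNSdumpster, then Nmap, then Maltego), the order of operations could be written down as a simple plan. This is only an illustration under assumptions: the `Case` class and `plan_steps` function are hypothetical names, `example.com` and `192.0.2.10` are reserved documentation placeholders, and the Nmap port range is an arbitrary choice. The code only prints a checklist; it does not run any tool, and any real scan should be limited to targets you are authorized to test.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    """Hypothetical record for one suspicious-domain investigation."""
    domain: str
    ips: list = field(default_factory=list)  # IPs noted during DNS mapping


def plan_steps(case):
    """Return (tool, action) pairs in the order described in the case study."""
    steps = [
        # Step 1: domain registration data.
        ("whois", f"whois {case.domain}"),
        # Step 2: DNSdumpster is a website, so this step is a manual lookup.
        ("dnsdumpster", f"search {case.domain} on dnsdumpster.com; note subdomains and IPs"),
    ]
    # Step 3: one port scan per discovered IP (authorized targets only).
    for ip in case.ips:
        steps.append(("nmap", f"nmap -p 1-1024 {ip}"))
    # Step 4: Maltego is a graphical tool, so this is also a manual step.
    steps.append(("maltego", f"graph {case.domain}, its subdomains, and its IPs"))
    return steps


case = Case("example.com", ips=["192.0.2.10"])
for tool, action in plan_steps(case):
    print(f"{tool}: {action}")
```

Keeping the plan as data rather than running the tools directly makes it easy to paste into a report, so someone else can follow the same steps and reach the same outcome.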
Oh. So can you share an example from your world where OSINT would have played a critical role, based on what you've learned so far? Uh-huh. So what's a space in life where you can use that? When in life have you used that to get an outcome you were looking for? Like, have you ever been online and seen an ad with a shirt that you really wanted, but you didn't know where the shirt was sold, so you searched for the shirt in Google and were able to find that it was from H&M? That would be an example of open source intelligence, because you have one item that exists in one plane, in one space, and it doesn't give you the outcome you're looking for, so you use the information provided by it to get your outcome.

Based on the tools that I've communicated so far, Google dorks, Shodan, Maltego, Recon-ng, and WHOIS, which of these tools do you find the most interesting to explore further, and why? Okay. So why Maltego? Yeah, it's a pretty cool project. And why WHOIS? All right.

So, of those two tools that you chose, Maltego is going to help you map the relationships and do the data visualization. A lot of times when we're going through our investigation, we're left with mountains upon mountains upon mountains of data, and it gets wearisome looking at it. Maltego helps build out a map so that you can get an understanding of how these pieces of information interact with one another. And WHOIS is a very unique domain registration query system, because it's nice to see that there's a database for every single website in the world. Knowing that there's a certain level of responsibility and structure to how we navigate things makes it a lot easier when there is an issue or a circumstance.

So in this first module I've been introducing you to OSINT as a discipline, right,
grounded in legality, transparency, and strategic utility. You've seen how Kali Linux supports forensic-grade transparency in investigations, and you're reviewing the primary tools that you'll be applying in the hours to come. Coming up, we're going to shift our focus to sources and methodologies. We're going to begin with data classification, asset enumeration, and cross-platform pattern discovery.

In this last time frame we've been using the OSINT overview, the Kali Linux documentation, the Shodan overview, and the WHOIS lookup tool. If you want, I can provide you the links to those and their Wikipedia pages. I can communicate them to you so you can look at them in the browser now, and I'll give you about, let's say, five or ten minutes to just get a brief general idea of what these things look like before we begin. Does that work for you?

All right, so what we're going to do... give me one second. May I ask where you're at? Are you in the United States? Are you in another country? Nice. Are you on the East Coast or the West Coast? I'm on the East Coast too. I'm in Washington, DC. Nice, look at you, right down the street. Unfortunately I am, yeah. Are you at home right now? It could be worse.

So the first website I want you to go to is... if I can pull this down now... If you look on your desktop, you should see a copy of the slideshow. Yep. I can't see your screen right now, give me one second. Yep, this one here. Yep, that one right there. So if you want to go back to the slides after this is over, feel free.

And we're going to begin. Can you open your Firefox browser? The first thing we're going to go to is the OSINT overview on Wikipedia. Are you able to see? Okay, it's not allowing you to type. Give me one second.
You can see my screen? Okay. If you give me two seconds, I'm going to reach out to IT and see if they can get that addressed. Are you able to read from my page for the moment while I get in contact with them? All right.

So while I wait for them to give me a response: I always try to give people resources that they can go back to after our training, that allow them to continue to use, engage, and learn. The OSINT overview here on Wikipedia is always a great source, because this information is always changing, it's always growing, it's always evolving. As new things occur and new instances exist, people tend to come back to this Wikipedia page and update it. As you can see, we go from the categories of things that can be used as open sources, to methodologies, to how it's defined as a whole and through different avenues and structures. It also provides references, so it gives you the places where you can go and get more insight if what you're looking for isn't here.

Another resource is kali.org. Here is the website for Kali Linux, where you're going to be able to find all the documentation for your tools. Sorry, let me let you go to that one. Okay, good. So here at Kali, you go to Documentation, and you can go to the tool documentation. If you go down to here, what you'll see is that every tool available in Kali Linux is listed, and if you're looking to use a tool, you click on it and it gives you all the information and details on said tool. It gives you all the metapackages that come with the tool, and it explains how to use the tool. It provides you every piece of information that you would need to function through the script that you're using.

So another thing that we use is Shodan.io. Okay, so can you type now? It won't allow you to type.
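Since Shodan was described earlier as a search engine for internet-connected devices such as unsecured webcams, it may help to see what a query typed into Shodan.io tends to look like: free-text terms followed by `name:value` filters such as `port:` or `country:`. The helper below is a minimal sketch, not part of Shodan itself; `build_shodan_query` is a hypothetical name, and the example filter values are arbitrary. It only assembles the query text and does not contact Shodan.

```python
def build_shodan_query(terms=(), **filters):
    """Join free-text terms and name:value filters into one query string."""
    parts = list(terms)
    for name, value in filters.items():
        text = str(value)
        # Values containing spaces are wrapped in double quotes.
        if " " in text:
            text = f'"{text}"'
        parts.append(f"{name}:{text}")
    return " ".join(parts)


# Example: devices mentioning "webcam" on port 554, limited to one country.
print(build_shodan_query(["webcam"], port=554, country="US"))
```

The resulting string can be pasted into the Shodan search box. Writing queries down this way also keeps a record of exactly what was searched, which matters for a repeatable investigation.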
Do you by chance have another browser that you could use? Maybe a Google Chrome browser or a Mozilla Firefox? Okay. Okay.

So then we have Shodan. Shodan is the search engine for everything. We talked about how you can use this site: you can enter a domain, you can enter a person, you can enter whatever it is that you want to be an indicator, and it'll create a map of everything that thing has ever engaged with.

Then we have WHOIS. I'm emailing with tech support now, give me one second.

And then, if you go to your Kali, you should see Exploit-DB, and what you'll find there is your exploit database. Sometimes we can find a significant enough vulnerability in the open source environment that it requires us to also add it to the exploit database. So it's something else that you would use to cross-reference the data that you receive, to know whether any of it corresponds to vulnerabilities as well.

All right. So while we're waiting on them, how are you as a learner? Do you like to scroll through the slides? Do you like to be hands-on? Are you a visual learner? What works best for you? I'll bring this down. If you want, you could go back... never mind. I was going to say you could go see what Maltego looks like also. All right, so we're waiting for a response from them.

All right. In this module we will examine the ecosystems from which OSINT is collected: not just the tools, but the actual information and sources. Open source data is vast and it's disorganized, so the critical task of the investigator is not just to find the data; you have to assess its origin, integrity, and legal standing. We're at slide four, I believe.

We classify OSINT sources into four primary categories. You have your web platforms and your domain records. You have your social media and your identity footprints. You have
your public --> records and your legal databases, and you have your underground and edge communities. --> So that's your forums, your dark web, your breach dumps. Each of those layers offers --> unique signals, risks, and evidentiary value. So we begin by mapping the landscape. --> Web-based OSINT refers to both surface content, that's the websites, the blogs, the --> products, the things you can see digitally, and the metadata: the DNS records, the --> server headers, the SSL certs. --> Things used on the back end. --> So does that make sense? --> All right, so these are the foundational sources --> for profiling organizations as well as attackers. --> Your key resources here are always going to be, again, --> WHOIS lookup for your domain ownership --> and your registration metadata. --> You're going to have DNSdumpster for your DNS mapping --> and your domain discovery, and Google dorks --> for exposed documents, misconfigured servers, and staging content. A lot of --> breaches occur not just because a person took the time to brute-force a password. Sometimes --> there's a 404 code or various other codes that are left unaddressed by developers, and --> that can be a weak point for security as well. In high-value --> engagements, domain OSINT often identifies legacy assets and abandoned infrastructure, often left --> unmonitored and unprotected. With that comes the idea of social media as an --> investigative surface. Social media is a rich vector --> for behavioral, relational, and reputational intelligence. It also presents --> legal complexity. So you have platforms like LinkedIn for --> employment mapping and organizational hierarchy, and platforms like X --> that has
enormous space for network analysis and threat actor signaling. And then when you think --> about Instagram and Facebook, you think about geotagging, event forensics, and lifestyle inference. --> So when you're looking for something that's employment related, you're always going --> to want to start with LinkedIn for your open source. If you're looking for something that's --> social identity related, you're going to start with Twitter/X, Facebook, and Instagram, --> as these are the ways that people psychologically navigate social media. Does that make sense? --> Your key considerations here are that you're going to want to cross-reference --> usernames and handles. --> You're hoping for the username on Instagram and Facebook to match the name on Twitter as well, because --> then you have an attachment to the indicator. You also want to identify reused bios, avatars, --> or hashtags. You also want to be able to extract what's called EXIF data from --> the posted images. The EXIF data is what makes up your metadata; it's going to give --> you the location, the person's device, and things like that. So for example, my company has a --> product that we sell called Signal Protocol. It's a proprietary --> cultural threat modeling toolkit built to address misinterpretation of social cues in OSINT --> based on language, region, and subcultural differences. We created this --> concept because when conducting OSINT on public social platforms, cultural fluency is just as --> important as technical skill. There shouldn't be a reason why something misidentified in one --> culture ends up being a consequence for someone in another one, where the identification and --> reputation of it is completely different. So we use open source intelligence not only to --> get this
metadata and this layout of the space, but we also apply --> the cultural aspects of it as well, so that there isn't a miscommunication. So, --> unlike media and social data, public records are often archived, authenticated, and --> legally admissible. Where social media and cultural differences --> can make a case hard, it's the complete opposite with public --> records, because this information is provided in its true format and is, more --> times than not, signed by somebody validating the information to be true. So when we talk --> about archived, authenticated, and legally admissible information, that includes your state --> corporate registries, used for your company research. --> You've got your court documents and your filings, so you've got LexisNexis, you have PACER, --> you have your local court systems, you have the probate system, and various other things --> available to you from the court system. You have your Freedom of Information Act archives and your --> regulatory portals, so your FCC, your SEC, your FTC, and your --> sanctions lists and watch lists, so your OFAC, your UN, and your Interpol. These records --> can confirm identities, establish timelines, and --> expose inconsistencies in statements and compliance claims. A forensics-first OSINT --> analysis always seeks source legitimacy before data quantity. So now --> that we've passed through your state-based public information, --> we're going to get into your forums, your paste bins, and your breach dumps. --> While often associated with the dark web, many breach discussions take place on indexed --> platforms. So your indexed platforms include
Pastebin, and that's a --> space where the only thing you can do is paste data. --> It can be pasted in the form of pictures, but there's metadata that's collected from those pictures, --> and that's how people pass information. You have your breach forums on the dark web, you have --> your Telegram channels, and you have your Reddit communities. All of these are considered --> indexed platforms because, although they don't use traditional forms of securing their information, --> it's still stored in a way where we can gain access. But the conflict is that these platforms --> usually contain leaked credentials, infrastructure notes, exploit timelines, and --> insider threat chatter. So you never, ever want to engage or attempt --> credential use. You only ever want to observe what's happening. --> You never want to take that name and put it into something to attempt to model or mimic that --> item or entity. So when we start talking about the OSINT methodology and framework, --> a toolset is only as effective as the methodology behind it. --> It's no good to have a wrench if you use it like a screwdriver. --> So we teach the following three phases of the OSINT --> workflow. The first one is information gathering. In this space you want to be --> able to define your scope: what is in bounds and what is out of bounds. When defining your --> scope, you always want to come to an understanding of what it is that the client is giving --> you access to, what is admissible, what is acceptable, and what is illegal. Defining your --> scope gives you the understanding of which social media sites you can --> use and why you can't use others. From there you go on to identifying your --> sources: which platforms, which tools, which
databases are applicable. And then the execution of --> that is the data that's pulled from those spaces using validated OSINT tools. So --> it's not something as simple as copying and pasting; the information has to be retrieved --> in a way that also allows you to hold the integrity of the chain of custody. So that's --> going to be your information gathering phase. Then you have your analysis stage. That's where you're --> correlating and organizing data from multiple sources, and that's when --> you start deduplicating. Sometimes you're going to come across data that's the --> same across all platforms, and you only need to store that once. As you do that, you're --> going to come up with one complete map of a person, and that's going to help you correlate --> data points. It's also going to help you find anomalous behavior, timelines, and --> inconsistencies. We are pretty habitual people, or we're much more habitual than we like --> to believe. We usually find ourselves using the internet at the same time during the day, --> we go through a schedule at a certain time of the day, and any type of out-of-the-norm interaction --> can be an indicator or a sign of an intrusion. In all of this you're --> going to apply cross-platform logic. You can go from an email to a domain, --> based on the server and the domain that email is attached to. --> You'll then go from that domain to a person, and then you'll --> go from a person to an infrastructure. All of the data you collect in --> this is going to go into your reporting, which is presenting your findings in a clear and --> actionable manner. That's going to include your documented --> findings with metadata and timestamped evidence
that's going to be structured in a manner --> admissible in legal and regulatory frameworks, and that's going to be using tools like --> Maltego to help people visualize the connections versus providing them data dumps. So it's not just --> a collection exercise; it's about narrative construction with forensic depth. You're --> not just collecting data, you're telling the story. So some of your key reporting --> considerations in an OSINT engagement: when producing --> the OSINT report, you have certain components that are essential for professional-grade --> output. Again, you're going to have a source chain, which is where you track where --> each piece of data originated. You're going to have your timestamps, which is --> where your document discovery time versus your --> publication time is recorded. You're going to have your corroboration, which is the --> space where you provide at least two independent data points for key assertions. You're going to --> have your evidence package, which is your screenshots, your export logs, --> your PDFs, and your structured data. And then you're going to have the executive summary, --> presenting the findings for a non-technical stakeholder. So you're --> going to compile your information and build it in a way that the person you're presenting to --> is able to understand the outcome of your investigation. So --> in this next activity, what we're going to do is execute a compact --> investigation using public tools. In this next example --> we're going to choose a public company and engage in OSINT using the things we've --> talked about. So do you want to do that along with me? Awesome. All
right, so what we're going to do is --> choose between three companies. --> We can use Uber, we can use Spotify, or we can use Razer, the computer company. Uh-huh, --> we use one of those three. Uber? All right, so we're going to use Uber. --> So open your browser. We say our target is Uber. The first thing we want to do --> is type in WHOIS lookup. We're going to click WHOIS IP search, and then you're going to --> type in www.uber.com. As you can see, it provides us all the public-facing information about uber.com. --> Here we see that we get a domain name, we see when it was registered, --> we see when it expires, we see the last time it was updated, --> we can see all of their statuses, and now we see all of their name servers --> right here. See that information there? Those servers are critical information when --> you use, say for example, Nmap. You could put this domain into Nmap and it would give you --> a structural layout of all of the ports available and accessible on that device. Depending on the --> security measures it has installed, it would also be able to tell you the operating system that's --> running on that device, it would be able to tell you which ports aren't available on that device, and it --> would tell you if you can communicate with it or not. We would take that information and --> put it into an exploit database, and that tells us the vulnerabilities that apply to --> it. So from the WHOIS space, the next thing we do is pull up another --> browser tab right beside it, and what we're going to do is what's called a --> Google dork. Google dorking is using Google to find information publicly --> that people wouldn't know is facing the public. So what we're going to do is type site: --> and then we're going to put uber.com and then
filetype. Oh, I'm sorry, go to www.google.com --> first, and then inside of Google you want to type site:uber.com, space, filetype:pdf. Here it shows --> you all of the PDFs available on Uber's back end. Some of these things we're supposed to see and --> some of these things we aren't supposed to see. So this is why we say you don't click: --> you see the information, you screenshot the information, you document the information. From --> here you would then go to Uber's Facebook page, and you would then pull Uber's LinkedIn profile. --> Based on these things, you will begin to create a profile. --> So let's say, for example, that Uber --> has a breach and they don't know if it's external or internal. Our job using open --> source intelligence is to determine who did it, what happened, and how it was done. So we'll go --> to LinkedIn and we will see the employees that Uber has. Looking at --> those employees, we would then cross-reference them to see if any of --> them have any cyber security or programming experience or skills. --> That will become a list of its own. We would then go through Uber and get a list --> of complaints or issues against employees; that would be a list of its own. We would then go through --> Uber and get a list of people who may not have had --> an issue or a write-up, but who might have written something outside of this space about the company. --> We use this to create our psychological profile in regards to --> how this thing happened or how it was engaged with. At the same time we're doing that, we --> go back to those name servers that I just showed you under WHOIS, and then we use these --> in DNS
Dumpster for subdomain mapping. From there you just document your --> findings in a short bullet-point format, and then you're done. Does that make sense as an overview? --> I don't want to go too deep into it, because we're going to walk through it as we go --> along. All right, so going back to Google dorking. In the space --> of Google dorking you want to be conscious of how you set up and how you engage with the --> internet, basically because we run into this space where programmers feel a sense --> of security when building certain aspects of their structures. In moving through a space --> in a certain type of way, you forget certain things, and the end result is the information --> that we're looking at now. So this can be used as a way to help your --> clients better protect themselves and to know where security leaks exist. For example, --> let's say I click here, only because I have a bug bounty with Uber, so we should be fine. --> See how I'm clicking here and it takes me to bradford.gov.uk? --> This is a source that Uber is using in regards to the information --> they're providing. Now, there's a good chance that, depending on these two organizations and --> how they transfer information, there will be a weakness between the two. See how long it's taking --> us to get access to the network? That would tell you that there could be --> something faulty going on in the background. So this would be something that you would document --> and then provide to the client based on open source intelligence. --> That's another thing I'll show you. --> So also with Google dorking, it's not just PDF files. --> It can be CSVs. --> It can be any file type that you can think of; --> essentially any of them can be tested to see if it's publicly facing. 
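The site-plus-file-type query from the walkthrough can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `build_dork` and `search_url` are hypothetical helper names, not part of any tool used in the session, and the list of file extensions is an example. The code only builds query strings and search URLs to paste into a browser; it does not fetch or scrape anything.

```python
from urllib.parse import quote_plus


def build_dork(site, filetype=None, phrase=None):
    """Build a Google query string from the operators used in the session.

    Hypothetical helper: it only assembles text, it never contacts Google.
    """
    parts = [f"site:{site}"]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if phrase:
        # Quote the phrase so it is searched as an exact match.
        parts.append(f'"{phrase}"')
    return " ".join(parts)


def search_url(query):
    """Turn a query string into a URL you can open by hand in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    # Example extensions only; any file type can be tested the same way.
    for ext in ("pdf", "csv", "xls", "docx"):
        query = build_dork("uber.com", filetype=ext)
        print(query, "->", search_url(query))
```

Keeping this to string building is a deliberate choice: running the queries by hand keeps the work passive, and the results still get screenshotted and documented rather than clicked through.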
--> So there's times where you might not find anything. --> There's times you might find everything. --> So the security seems to be tight on all of them, except they have PDFs exposed. --> I have to put the period. But everything else is essentially locked down and secure. --> So for example, based on what we've talked about in open source intelligence, what information --> do you think you could pull from their Facebook page? Looking at the Uber page on --> Facebook, what information do you think would be important to gather for an investigation? --> Yep, that's one. Images are a good --> source for metadata. What's something else you could use? Yep, you can look through the comments --> and see if there's anything that anyone may have referenced that may correlate with --> the breach or the concerns the client may be having. What's something else you could get --> from there? Yep, you can use their followers. You might be able to scour the followers and find --> names that cross-reference with something that you found on LinkedIn or somewhere else. --> Yep. Are you savvy with developer tools? For example, in most browsers, if you press --> F12, it'll show you the code that's running that page. --> That's another source of open source intelligence that you can use. --> More times than not, you will find the emails of --> developers there. Sometimes you'll find notes left behind that give you --> clues about how you can maneuver --> around the site or its structure and its source. --> This is just to show you that Facebook is a great source for --> open source intelligence. It's not about what you're seeing, but --> it's about the map that's created from it. --> Now, going to LinkedIn, what things --> do you believe you could use from their LinkedIn page? Yep. 
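The idea of matching follower names against names found on LinkedIn could be sketched roughly as below. Everything here is an assumption made for illustration: the names are invented, `normalize` and `cross_reference` are hypothetical helpers, and the lists stand in for data you would have collected and documented by hand within your scope.

```python
import re


def normalize(name):
    """Lowercase a name or handle and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def cross_reference(source_a, source_b):
    """Return (a, b) pairs whose normalized forms match across two sources."""
    index_b = {normalize(entry): entry for entry in source_b}
    return [
        (entry, index_b[normalize(entry)])
        for entry in source_a
        if normalize(entry) in index_b
    ]


if __name__ == "__main__":
    # Invented example data, not real accounts.
    facebook_followers = ["Jane.Doe", "sam_lee88", "Alex Kim"]
    linkedin_employees = ["Jane Doe", "Alex  Kim", "Priya Patel"]
    for follower, employee in cross_reference(facebook_followers, linkedin_employees):
        print(f"possible match: {follower!r} <-> {employee!r}")
```

A match like this is only a lead. Per the corroboration point earlier in the session, a key assertion still needs at least two independent data points before it goes in a report.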
So let's say you take the employee --> list from LinkedIn and then you cross-reference that list against all of the posts they've made --> in the last five years. The end result would be a map for each employee --> and the things they've said over the last five years, and that would give you an idea of where --> to begin. Even if that person isn't the end result, they're potentially something that could --> give you another piece of the entire puzzle. So using WHOIS and the information that's presented --> to you, how would you use it? Uh-huh. So let's say, for example, you have your name servers --> here, correct? An example would be a site called greynoise.io. What you can do --> with GreyNoise is place those domains that you got from Uber inside. When you do --> that, more times than not it's going to be behind a --> security wall or security monitor. So you would do --> that with each one of those domains, attempting to get to an IP address. --> So I'm going to find one that does have results. For example, let's say I did --> Google. Now you see that you essentially get a semi-threat map of all of the IP addresses --> used. You will see the ones that are good, you will see the ones that are malicious, and you'll see --> the ones that have been doing suspicious things. So that's what will come from your --> information using WHOIS. And the information provided here from WHOIS is what --> you're going to use for your Nmap scans and a lot of other detailed, pertinent points within --> your methodology to find out the information you're trying to get. See how --> all of this information is provided to you without having to actively engage with --> the intrusion? This is how open source intelligence works. All right, so based on the
information that you see in the query results, --> what do you believe is information you could use in the rest of your --> investigation? Uh-huh, yes. With the information provided to you on GreyNoise, --> what do you believe could be used in your investigation? --> Let's say, for example, Google is your client, and Google wants you to tell them if --> they have any threats within their organization, on their network. So let's say instead of --> doing WHOIS for Uber, we did WHOIS for Google, and based on the domains that were provided --> to us from Google, it presented us the information that we're looking at now. To your left --> you have your classification for all of the IP addresses available to us. It's showing --> you which ones are malicious, which ones are known, which ones are --> benign and haven't really been doing anything, and which ones are suspicious. So based on that --> key, what information do you think would be important to your investigation? If you --> look at the IP addresses, what you'll notice is that you have some that are green. They're --> benign, so that means they've been around for a while and they don't have anything weird going on --> with them. You have some that are yellow; those are suspicious and you --> have to keep a good eye on them. And then you have some that are red. The red ones indicate the --> IP addresses and the domains that have already done malicious activities. So you would --> catalog all of them and classify them based on their severity. Then, as you --> go through the rest of your investigation, let's say for example you find --> nothing from your suspicious or your malicious ones, but then one of your benign ones, one of your --> greens, actually ends up being the
faulty avenue. So in open source intelligence we --> still collect all data, even if it doesn't look like it might be of value, because in the end --> it could very well be. Does that make sense? Okay. So is it sticking with you --> right now? Are you able to follow along with what we're going through? Okay. All right, so another cool thing --> about GreyNoise is that not only does it provide you the classification, but it'll actually go through --> and tell you which ones are false. It'll tell you where the source countries are for these, --> and it'll tell you where most of the communications are going, as well as the tags that --> were used to find it. So open source intelligence provides you with way more information --> than you could ever imagine. For example, if you find the, yep, six, yep, what you'll find is --> another layer of information. You can find the timeline, you can see how this IP --> address has navigated across the network, you can find the ways in which the information was --> found, and you can find your DNS information and everything else that you would need to begin --> the scope. Now, this information here to your right is important because, let's say --> for example you log all of this. You now have locations, and not only do --> you have locations, but you have locations based on IP addresses of various sorts. --> Now let's say you have a hundred thousand pictures in Pastebin that you need to go --> through. You can have the indicators for these locations set within your search in --> Pastebin, and it'll pull out all the pictures from these locations. Does that make sense? So --> you're using open source intelligence to continue to peel back layers based on information --> that doesn't require authorization and verification. So I want you to follow me on this one: OSINT --> encyclopedia. It should be the first one at the top.
So this is a resource I found that I go back --> to, and a lot of professionals I know do too. Basically, it's a checklist for everything OSINT --> related. Following this guideline generally puts you in the space of an effective --> layout for what you would need for, let's say, an expert witness statement. --> It takes you through each structure and breaks it down to a point where you can identify --> basically a walkthrough. You should have an appendix at the back of your student manual, --> and that information should be in it. Let's do something specific. Here we are now at --> DNSdumpster. So again, back to Uber, my name server, and now we have the location, IP, and ASN --> information on the server. See how we have the IP address now, and see how we couldn't get the IP --> address last time? So now we have a back door. You want to take that IP address, --> then go back to GreyNoise, put it there, and now you open up another layer that wasn't there before. --> I wanted to show you that because just because you run into a roadblock one way --> doesn't mean that there's not a back door another way. Make sense? Okay. So in this session --> we've learned how to structure an OSINT investigation around authoritative data sources. --> You've also identified four major source categories, we've practiced tool-based data gathering, --> and we've reviewed analysis and reporting best practices. The idea of best practices --> is that you want to remember how you gather information, what information is --> acceptable and what isn't, and to stay within legalities. Those are your important best practices. --> So now we're going to start going a little more in depth with the tools, and we will --> start doing more elaborate demonstrations with Shodan and Maltego, and we're going to start --> doing a little more advanced Google dorking with Kali. Does that work for you? So we've been doing --> this for
about an hour and a half now. Do you need a break? I can do a short five-minute break --> if you need it. Okay, let me know if you need a break and we can do a little five or ten minutes. --> That work for you? All right. So you understand what OSINT is and why, at this point. --> Now we've got to go more into detail about the how. So for the next hour --> we're going to focus on the tools that transform this raw open source data into actionable, --> visual, and validated intelligence. All of the tools we're about to cover are --> available in Kali Linux, and they're available in any digital forensics and penetration --> testing distribution that's used for precision investigations. So --> these tools are used across what's called DFIR, so digital --> forensics and incident response. They're used in red teaming, legal --> investigations, and breach response. So now we're going to go back to one of the most --> common and deceptively powerful tools in OSINT, which is Google dorks. --> As I already explained to you, and we've seen an example, it's just advanced search --> operators that allow you to extract publicly indexed information that's, more times than --> not, overlooked or unintentionally exposed. So you can have a situation where there's a --> developer who's just moving way too fast, and he might accidentally not secure something --> the way it needs to be. Then you're going to have an instance where an employee knows --> information is critical and wants to have access to it outside of their office, so --> they leave it in a vulnerable way where it's publicly available. So some of your most --> common, we go back to Google now, some of the most common forms of Google --> dorking are going to be, for example,
you're going to have filetype:pdf, and then the --> site could be for the government. Let's start there. Now all of a sudden you realize that --> every indexed PDF for the government is accessible, whether it's supposed to be or --> not. You can, scarily, do intitle:, and then you can do index of, --> index.of, --> then you can put a space and type password, and now you see all these passwords that are --> publicly available on the internet that we shouldn't be seeing. You can use inurl: --> and then login, and now you see that it provides you with all the login pages indexed under that name. --> Of the login pages we're looking at, not all of them are going to be something that a client wants --> publicly facing. You really wouldn't want your member access page to be publicly facing --> in the sense that anyone can get to it. You would want it to be in a space where --> only those who need access to it can gain access. So it gets even crazier. You can go --> on and say, for example, let's stick with Google because Google seems --> to be a good one right now, so google.com, and then you can say confidential, and it's going to --> show you things that may or may not be marked confidential. So for our example, --> we're going to do site:, and we'll say justice.gov, and then for the file type we want to try Excel. --> See if we find anything. Oh, they didn't. Man, these things do not give up. I have no idea. --> Just to keep you from fighting with this: this will reveal any --> exposed Excel documents within the U.S. Department of Justice domain. While most documents are --> public, some may reveal internal structures, outdated contacts, and poorly protected internal data. --> So then you've got to be conscious that these are considered --> passive queries. When you start going into the XLS of it, or actually
engaging with --> those files, that's what's called active querying, whereas viewing indexed content is passive because you're --> not penetrating the system. When you get to the point where, say for --> example, you hear about people doing what's called web scraping, --> basically if you automate scraping, or if you fail to observe the terms of service of --> a site, you can also cross legal boundaries. So let's say, for example, Google --> dorks can be considered web scraping, because you're compiling a bunch of information --> into one space, and the way that you're engaging with it is one where you're --> basically just pulling the data that you need, like pulling out that metadata. --> Sometimes the services specifically --> prohibit you from doing that while using that service. So now that we've moved on from Google --> dorks, I think we'll get to what you wanted to see a little more about, which is Shodan. --> All right, so we go to www.shodan.io. Unlike Google, which indexes websites, --> Shodan indexes internet-connected devices. This includes anything from CCTV --> cameras and routers to industrial control systems and medical devices. --> The cases where you would use Shodan would be --> to find misconfigured IoT devices, or if you were trying to identify certain types of systems --> that are exposed to the internet. You could use Shodan to track vulnerable servers and the --> known CVEs, so those exploits I've shown you in that database. You can also use it to understand --> the organization's attack surface. So you can use Shodan to see how devices communicate and --> what they communicate with, and the end result would be understanding, basically, the
and the score for their security. One example of a query inside of Shodan would be webcams. You could also do SSH services, or exposed devices linked to a specific organization. So for an example I'll go in and let's just say port:3389 city:Arlington. I'm trying to figure out why not... I literally just used it this morning. Let's see if I try somewhere else. They want me to log in. That's what it is. I just want you to be able to see the tools. For example, see how I query camera, and now it shows you a complete map of everywhere across the world and how those things engage. Let's pick this IP address. Now see how we see every active port that's available on it. See all of these active ports here? This is the information that you would take and put into, let's say for example, Nmap. This is where my asking you about knowing Linux comes in. If you look at the top of your screen, see that black box at the top, the console? Uh-huh, right beside the five. Yep, the terminal. What you want to type in there is sudo, s-u-d-o, uh-huh, space, zenmap, z-e-n-m-a-p, Enter. Now see how you have that IP address at your left here? You want to take that IP address and put it in: two seven dot two, or seven nine, dot two two four dot twelve. And then you want to go to "Intense scan, no ping", but you have to put your target first. Oh, it is? Okay. And just like that you just started your first scan, all based on information publicly available on the internet. Now see how you have a list of all of these open ports. Based on the open ports, the type of operating system it's using, and various other avenues that are going to be 
available to you depending on what's found in the scan, you're going to be able to go from here to the Exploit Database and find out which of these things are vulnerable, given the new information that you have. Does that make sense? This sort of engagement is used by cybersecurity professionals on one side, but also by cyber criminals on the opposite side. It's basically just doing enough research to find out what's publicly available about critical infrastructure.

So now that you've collected all of this data is when we get to Maltego. Now that you've gotten your information from GreyNoise, from Zenmap, and from your social media platforms, you go into Maltego and you create the relationships between these things. Let me see if I can... Do you mind watching a short video on Maltego, just to give you a better understanding? Because it's not going to let me use all my features on here. I'm going to find the video, but in the meantime, just for a couple of seconds, this is going to give a little breakdown on it. Maltego is an open source intelligence and forensics application. It will offer you timely mining and gathering of information, as well as the representation of this information in an easy-to-understand format. So Maltego is right there, and that's the... one second, I've got to transfer this video over. One second. If you want, you can go to the top left, and under Applications click Maltego. It's not going to let me get around paying for it. So basically what you would do in here is, once you've populated all the information that we found based on your DNS dump, all the information found in your CVEs, the other information you 
found in your query, all this information would be compiled into Maltego, and there it would cross-reference all the information and create a map for you. That map would look like this, for example. You will have a map with your centralized point and all of your indicators, which will be your IP addresses, your subdomains, and any other activity found, and then you will have a legend at the bottom right, as you can see. That's what Maltego gives you: it creates a map of all your data in one centralized place. Does that make sense? So, the steps you would take in Maltego, which unfortunately I can't show you right now: you would resolve the IP address, so you would determine that that's the specific IP address. It would then go through and discover all the domains linked to that IP address. It would then find the SSL certificates, or security certificates, for those domains, and then it would pivot to the social media references for those things. By the end you'll have a spider-web diagram with all the relationships, and it's invaluable to both attack simulation and breach reconstruction. As you can see, the colors allow us to see how things transition. It goes from a deep blue to a light blue to a purple, and that shows us that the thing has navigated within a space and changed as it went to the next phase. All of this information is important when you're doing open source intelligence, because it fills in the gaps that a client maybe isn't able to provide.

So now I'm going to go a little bit more into detail about WHOIS. Basically, WHOIS is a protocol 
that allows you to extract the registration metadata for domains. It is going to include your owner identity, registrar, creation dates, and name servers. Now, because certain states and certain industries have different laws, there are times when you run WHOIS and it won't provide you with a name. It'll be hidden behind what we call a privacy or proxy service, so you wouldn't be able to know who the owner is, because the domain is registered under what's considered a registered agent. Let's say, for example, in the state of Wyoming, if it's a member-managed company, then there is no identity listed for the owner of the company, because it's only based on the member. Does that make sense? It's just something to be aware of in your WHOIS analysis, in case you come across a space where you want to see who the owner is and it says GoDaddy, or you look for a location and it says Saint Kitts or Switzerland, because those are two spaces that don't actively engage with America on providing cybersecurity information. So let's say, for example, one of the training URLs that's set up specifically for use with WHOIS, one of the spaces that's set up for training purposes, is called suspiciousbankonline.com. So, suspiciousbankonline.com, it should be... hold on, give me one second. That's strange, just the other one. Give me one second, give me one second, one second.
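As a rough illustration of the WHOIS lookup described above, the following Python sketch sends a query over the WHOIS protocol (plain text over TCP port 43) and pulls out the registrar, creation date, and name servers. The default server name, the field labels, the redaction keywords, and the sample response are all assumptions made for illustration; real registries format their output differently, and the privacy check is only a heuristic.

```python
import socket

# Keywords that often signal a privacy/proxy registration (heuristic assumption).
REDACTION_HINTS = ("redacted", "privacy", "proxy", "whoisguard")


def query_whois(domain, server="whois.verisign-grs.com", timeout=10):
    """Send a WHOIS query over TCP port 43 and return the raw text response."""
    with socket.create_connection((server, 43), timeout=timeout) as conn:
        conn.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_whois(raw):
    """Extract registrar, creation date, name servers, and a redaction flag."""
    record = {"registrar": None, "created": None, "name_servers": [], "redacted": False}
    for line in raw.splitlines():
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key_lower = key.lower()
        if key_lower == "registrar" and record["registrar"] is None:
            record["registrar"] = value
        elif key_lower == "creation date" and record["created"] is None:
            record["created"] = value
        elif key_lower == "name server" and value:
            record["name_servers"].append(value.lower())
        elif key_lower.startswith("registrant") and any(
            hint in value.lower() for hint in REDACTION_HINTS
        ):
            record["redacted"] = True
    return record


# Hypothetical response text, shaped like common registry output.
SAMPLE = """Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Creation Date: 1995-08-14T04:00:00Z
Name Server: A.IANA-SERVERS.NET
Name Server: B.IANA-SERVERS.NET
Registrant Name: REDACTED FOR PRIVACY
"""

if __name__ == "__main__":
    print(parse_whois(SAMPLE))
```

Storing the raw response next to the parsed record matches the logging habit described in this session: the parsed fields are for convenience, and the raw text is the evidence.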
I'm currently looking for a site so you can get a more in-depth look at this. You say you need the bathroom? I mean, yeah, we can take a five-minute break if you want to. We're ready when you get back. Oh, no, I'm fine. So you're... 22 minutes. I'll be here when you get back. You're welcome. All right, awesome. So if you notice, I moved from WHOIS to ICANN. It's just an alternative; I'm tired of fighting with the CAPTCHAs. If you want, in that search you can type in ICANN, i-c-a-n-n, lookup. So just type in icann.org. Yeah, that one. Uh-huh, you can click on that one. At the top, type in lookup.icann.org. So go to ICANN Lookup, and then type hackthissite.com where it says "Enter domain". All right, so now we see the same information that we've seen on WHOIS. We see the name of the site, we see the registry domain ID, we see a status, and we see its two name servers. In the OSINT search, what we would do is take this information again: your registrant's mailing address, ISO code, and administrator email. See how we have all of this pertinent information about this site? Now you've got your raw responses. Your raw responses are what you would store; this is what you would be logging. That's going to give you your object name, which is a domain, the handle that it goes by, and all of the raw reference material that's specifically identifying for that instance. Does that make sense? So we take this information and then we go to DNSdumpster. Again, take your name server, copy it, and go to DNSdumpster. See how there's nothing showing for it? So it's just not an IP that it can show. So let's say, for example, from there, 
say you want to go to the site, so you go to hackthissite.com. That's not right, hold on. My apologies, that was on me. Now we've got to go back and do .org, and now we get a completely different result. Uh-huh, it's .org, not .com. No, it's .org. That's correct. So we go back to ICANN, because I was wrong on that one, and instead of .com we put in .org, and then we copy this domain name and go back to DNSdumpster. Yep, your domain name... your name server. Yep, the first one. Uh-huh, and we start our search. You may have to refresh the page first and then try it again. So now see how we have an IP address for this server, and we have all the information that we've been provided. From there we want to go to GreyNoise, which you can search for free, and we put the IP address that we received from here into the query. Go over one more, so instead of being here, go here. On your side it should be here. See where it says GreyNoise in your tabs at the top? To the left, the one beside that one. Opposite direction. Move to your left two tabs at the top. See where you see the IP address? Beside GreyNoise you can type it in there and it'll do the same thing. So now it says "further investigation recommended". That lets you know that you're not the first person who came to this IP address. From here you would log this information, take a screenshot, and then you would take the name servers that you got and add those to Shodan, and Shodan will show you basically an entire map of everything that server or IP address is engaged with. All of that information collectively goes into Maltego, and that creates your model, 
and then your model is what you write your report from. Does that make sense? So based on the website itself and the information that's been provided from ICANN, DNSdumpster, and the query from GreyNoise, what information would you say you have collected so far? If you were walking through the steps provided to you so far, how would you go about acquiring this information? What would be step one? If you were working on open source intelligence, what would be your first step, using HackThisSite? You have the name of the website and you want to begin doing an assessment on it. Now that you have the name, what would you do with it? Where would you copy and paste it? WHOIS and ICANN. If you want, you can open up a notepad. You can use CherryTree; it's a pretty simple one. When you're starting an open source intelligence search, the first thing you want to do is gather information from the client. They're going to give you the IP address, and they're going to tell you what information they are and aren't looking for, and then you go from there. You would take the website they've given you and all the information provided and go to WHOIS and ICANN, and that would provide you the name servers. Make sense? All right. From the name servers you want to go to DNSdumpster, because DNSdumpster is going to take those name servers and provide you the IP addresses that are used with them. So you're going to get those IP addresses. While it's getting those IP 
addresses for you, you're also going to put that same information into Shodan. And while that's compiling information for you, you want to go to GreyNoise with the IP addresses and the information that's been provided to you, and from there it's going to give you one of your threat maps. In the breakdown of all of this information, by the time you finish you'll have the name, you'll have the domains available, you'll have the IP addresses available, and you'll have a social media layout for anything that is interactive or navigated. That's going to be the completion of your OSINT methodology, but at the same time it's going to begin your actual assessment. The information that we're communicating right now is the surface level, because it provides you the details and important information you need to dig deeper. Your domains, once they're put inside theHarvester, will then provide you all the subdomains and all of the directories inside of that server. So it's going to go from being Uber, to Uber accounts, to Uber accessibility, and all the various avenues and components within that site. From there you're going to cross-reference the social media information that you found, as well as the information provided by the client, to find any anomalies in how that system normally works. At times you can get access to what are considered event logs, which show you every single interaction on that device. In the instance of the device logs, let's say, for example, that whenever this person is sitting in front of their computer they normally make 42 different engagements within three minutes on that device, but for some odd reason when you go to look at the logs it's only showing 41. That would 
indicate that someone's been manipulating the logs. That would be another indicator that something is going on, and you add it to your assessment. Does that make sense? Okay.

All right, so now that we've gone through WHOIS and Maltego... we kind of went through DNSdumpster, but I'm going to go a little bit more into detail with that. DNSdumpster is really just a free tool for discovering subdomains, MX records, and DNS configuration. It's useful in mapping out an organization's web infrastructure footprint. As we go further down, you can see that you can download the Excel for this, but you also get a map of what their layout looks like. We can see that the IP address we're looking at right now is attached to this name server, which is attached to this device, which goes back to this domain. You have your DNS... yeah, you don't have to use it. If you scroll down a little bit further, it'll show you. The Excel is just this map that you're looking at, in block form. But this right here is showing you the layout of their setup. You go through and put each name server inside of here, and then you'll get an even larger map. It's usually used to help identify forgotten staging environments. Let's say the dev team came in and had to build a new component for the application, and in doing that, not so much that they rushed through it, but on their cleanup they didn't remember to disconnect a few things. So now there are just open ports, and now you're giving a person the opportunity to continuously breach, or attempt to breach, this space. You can also use it to find, like, 
login portals, like admin consoles. Most sites have an admin console, but how you gain access to them looks different, so you would use DNSdumpster to show you the actual IP address or the actual domain name that points directly to that console. You can also use it to catch what are considered S3 buckets, or misconfigured cloud environments. The only thing worse than a misconfigured personal environment is a misconfigured cloud environment, because the cloud environment essentially never shuts down and never turns off, and if it's not done correctly it's always giving someone access unbeknownst to you. As we've already seen in the analysis we did for the site we're looking at now, HackThisSite, we discovered the subdomains of cmsbuddyms.com. In that subdomain we found that the MX record showed us how emails are being routed. This is showing us that all the MX is being held here, completely different from their average level of access. This type of information not only supports breach simulations, it's also going to support us in red teaming. We could take this information that's being presented to us right now and, based on the methodology that we've worked out with the client, this would give us the access that we need.

So now we've got to get hands-on. I need you to pick a site, and then we have to run a real-time OSINT investigation using the tools we've covered so far. Does that work for you? All right, so what site would you like to use? You can use your school, you can use anything that's close to you, anything that you think would be interesting to know more information about than you do now. Starbucks.
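The DNSdumpster uses described above (spotting forgotten staging environments, admin consoles, and mail routing) can be sketched as a simple triage pass over exported records. This is a minimal Python sketch under stated assumptions: the record tuples, the hostnames, and the keyword lists are hypothetical and do not reflect DNSdumpster's actual export format.

```python
from collections import defaultdict

# Hypothetical keyword lists; tune them to the engagement.
STAGING_HINTS = ("dev", "staging", "test", "uat", "old")
ADMIN_HINTS = ("admin", "login", "portal", "console")


def triage_records(records):
    """Group (record_type, hostname, value) tuples into review buckets."""
    buckets = defaultdict(list)
    for record_type, hostname, value in records:
        # Compare whole DNS labels so "latest.example.com" is not flagged as "test".
        labels = hostname.lower().split(".")
        if record_type.upper() == "MX":
            buckets["mail_routing"].append((hostname, value))
        if any(label in STAGING_HINTS for label in labels):
            buckets["possible_staging"].append(hostname)
        if any(label in ADMIN_HINTS for label in labels):
            buckets["possible_admin"].append(hostname)
    return dict(buckets)


# Hypothetical records using the reserved example.com domain.
SAMPLE_RECORDS = [
    ("A", "www.example.com", "192.0.2.10"),
    ("A", "staging.example.com", "192.0.2.20"),
    ("A", "admin.example.com", "192.0.2.30"),
    ("MX", "example.com", "mail.example.net"),
]

if __name__ == "__main__":
    for bucket, hosts in triage_records(SAMPLE_RECORDS).items():
        print(bucket, hosts)
```

A flagged hostname is only a lead for the report, not a finding; each one still has to be checked against what the client says should exist.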
Boom, let's go. Let's go get Starbucks. So what's the first thing you would do? And once you know what Starbucks' website is, what would be the next thing you would do? You would take... next... I'm sorry. So what we want to do first: if you want, you can close out all of your tabs at the top; we're going to start fresh. We're going to first pull up Starbucks' website. In the next tab at the top, we're going to open up Starbucks' Facebook page. And now we're going to open up Starbucks' LinkedIn; you can open up a new tab. Now that we have our base information, our social media platforms and our website, what would be the first thing you would do with starbucks.com, and what website would you use to find information on starbucks.com first? Yep, uh-huh. So think about it like this: the first thing you want to know is who is, so the first thing you go to is WHOIS or ICANN, because those are always going to be the initial things you can do. Yep. If you go to the top and type "whois lookup" in your search, it should come up properly, the first one. I'm not sure why it keeps redirecting us to this weird one for some odd reason. So now we put in www.starbucks.com here. All right, and now see how it's provided your information again. So now you copy that, and you place that in which tool? Yep, DNSdumpster. Now go back to your DNSdumpster and scroll down. What information from your DNSdumpster are you going to use next, and where would you use it? Yep. And where would you use it? You can use it in Shodan, but you can also use it in GreyNoise. Uh-huh, yep. All right, so now see how we have this IP 
address here? All right, so now, based on all the information that's been provided to us, if you want you could also go to Google and Google dork Starbucks to see if they have any PDFs out there, or any Excels or the like. So we would type filetype:pdf, then the company, Starbucks, and that's going to show you each of the PDFs revolving around them. Now the question we want to ask is: what type of report will we want to write, based on all the information that we've found thus far? We want to provide names... names of servers? Say that one more time, I'm sorry. So we would compile the name of the client and the information that the client provided us, which is the name of the website. We would then begin to record as we found data. So we would go from our name servers and collect our IP addresses, our MX records, and everything else that would be pertinent to getting an understanding and creating the actual map of all this information. And for the next thing that we're going to work with, your job is going to be to find at least one exposed document. You also have to present me one WHOIS registrant summary, you have to provide me with the IP address, and then you have to give me a subdomain tree. Does that make sense? All right, so do you want to go through and show me how you would find an exposed document using Google dorks? What you would do is start with the type of file you're looking for. You could go filetype:pdf, and then you go to site:, and you could do starbucks.com, for example. Now see how it brings up every PDF under Starbucks' domain. So now I need you to provide 
me with an example of you doing that. On the front end you're looking at it item by item, but if you were using the command line in Linux, you would be able to give it keywords to search through all of those PDFs for, and it would only give you specifically the ones you were looking for. You would have buzzwords that you would give it to search for, and it would just produce all of those back for you. So, for example, see on my side where it says site:starbucks.com, and then I go, let's say, strawberries. I put it in quotation marks, press Enter, and it brings up every time strawberries is referenced on Starbucks' site. So if you notice, filetype will indicate specifically looking only for that file type, and site will only scour that site for that thing. It doesn't have to be specific to an item. It can be a word, it can be a name, it could be whatever you want it to be. So for example, uh-huh, let's say John. We're going to see every time the word John shows up on Starbucks. There's a barista named John, there's a shift supervisor named John, you know. So do you see how we're finding all of this? Uh-huh. So let's say, for example, it was about trying to use social engineering. Now that I know John's name and I have a picture of John, I can go to Facebook and find John. I can find out that John has a dog, I can find out that John goes to the dog park every Sunday, and I can go and befriend John at the dog park. For example, I don't allow any of my employees or subcontractors to post about our businesses online. Not by law, but that's a company requirement, and that's because people can use 
social engineering to gain access, unwillingly, or unknowingly more so. Could you show me an example of using Google dorks now? Using Google dorks? Uh-huh, just now. Yeah, sure. So first you want to type in site and then a colon, so s-i-t-e, colon, and then the site we want to look at. I'm going to show you a way open source intelligence can help you in education. So site:, and then what's the... academia.edu, that has all the research papers in it. So we'll do site:academia.edu, and then we're going to say Vietnamese, since that's what your major is. And now we see every Vietnamese paper on academia.edu that's been written. Open source intelligence can help you when you're starting to do research, because it allows you to compile the information that you specifically need, away from all of the other information that's available. Does that make sense? That's all Google dorking is. It's not any sophisticated, advanced technique, other than you saying site: and the program knowing that it's specifically looking at that site, and then it's just looking for whatever indicator you've given it. If you look at the last time we did it, we did filetype:pdf, so it knows to only provide PDFs. If we do filetype:csv, it knows to only bring us that format of file. It's a great tool to have when doing research, to be able to pull everything together, and you can actually continue to chop the information down to the point where you'll have one condensed version of everything you need. Let's say you start with just Vietnamese, and there's a certain part of Vietnamese culture that you want to engage with. You can slice the information down to that. You can talk about 
a specific space within that culture until you get directly what you need. So now can you do it? Can you show me a way in which you would use it in your everyday life? Once... uh, see, here's why it's scary using ChatGPT. The way I explain it to people is that we as humans are still babies, and ChatGPT is a teenager. As teenagers we've always wanted to be liked, we've always wanted to be right, we've always wanted to be accepted and to be able to engage, and with that comes a little bit of faltering. You could ask it a question, and the response could be other than what you're looking for, but because it wants to continue to engage with you, it's going to tell you some manner of truth. It's going to tell you maybe that's not the way you want to go and these are some other options, but it's essentially going to begin to lie to you the longer the prompt goes. Does that make sense? It's fine if you have a certain level of understanding, or the ability to research after you gather the information, but could you imagine having a 30-page paper that you relied on ChatGPT for, only to turn around and realize that none of it has anything to do with the course? It's impossible. This is the thing I tell people, that's the secret about ChatGPT: what we think AI is isn't what AI is. What we're engaging with right now is called machine learning, and the window that we're going to have with artificial intelligence is going to be so short, because no one's thinking about post-quantum cryptography. AI can conceptualize things based on what humans have provided to it. Quantum computing is instant; it can instantly work around anything. So where AI is still conceptualizing, quantum computing has already put 
it inside a box. Does that make sense? I always try to communicate to people: when we think of artificial intelligence, we're thinking about something conceptualizing information out of thin air, whereas ChatGPT and a lot of these language models that are being presented to us are all referencing a library. They have to be trained on a data set, and if it had to be trained on a data set, then this is machine learning. You couldn't give ChatGPT a thing that a human's never done before and have it still accomplish it. Does that make sense? It's because it can only work off the library we've provided it. Quantum computing is every day creating things we never knew existed. Google, for example, owns a quantum computer that's made time crystals, a phase of matter that keeps cycling without losing energy. And I say all that to say a cybersecurity professional is always going to outsmart ChatGPT. That sounds crazy, right? All the information that's being compiled for you by ChatGPT, at best it can only give you 90 percent, because it's never had the opportunity to deal with human interaction, which is what the information is based on. It's one thing to touch the stove and know it's hot. It's something completely different to be in the transition of touching the stove and remember that it's warm. AI will never know that you can feel the warmth before you touch the stove. It can only rely on you telling it that there's warmth there, and those create different responses. Make sense? So that's just a small part about the GPT part. So, in your day-to-day, where ChatGPT couldn't help you... are you a Taylor Swift fan? I 
said, are you a Taylor Swift fan? Okay. So imagine, boom, you're online and you realize Taylor just released a surprise sale, but you have to find the site that it's on. You could use the site: command, or filetype:, or a keyword to specifically look for Taylor Swift in the last hour, and wherever that site is for those tickets, it's going to have the most traffic. And now you've found the tickets. That would be a way you could use open source intelligence in your day-to-day life. Another way: let's say you go to Target and you're looking for a specific product and it doesn't happen to be there. Open source intelligence would be you going into Google, typing that product in, and then looking at all the other stores that also have that product. So now that we've gone through Google dorks, what would be... say that one more time? So, for example, you go to site: and you can type in Twitter, or you can type in X. Yep, so you can put x.com. And remember, you've got to use your quotes. The reason why I didn't do it for that one is because we had already indicated the site. For example, when you write it this way, it doesn't have a site, because it doesn't have anything to close it out. But if you write it like this, see how it comes up this way, and now you have every Taylor Swift account or post on X. All that to say, it's just a great research tool that allows you to go and pull any pertinent information that you might be looking for. And then you can add filetype:, and it's going to pull every PDF off of X that exists. For example, I think there's a lawsuit here. I don't think we should be able to see that. Oh no, it's a transparency center, so it's fine. But they say that, like, for example, 
like for example like --> just another witness you could use information um another thing that you will find in doing this --> also is you'll get a site link and you can take it and you can put it in Shodan and it'll give --> you like the ip addresses and more information as well so if it's possible could you show me the sub --> domain tree using DNSDumpster for starbucks you remember how to do that so you should still be --> able to yep so if you scroll down see how we have that tree right there that domain tree --> it's showing us the layout so now what you want to do is you want to go back to your WHOIS search --> and you just want to pull the second one to do the exact same thing because --> see how they're different so you want to do all of your name servers just --> because you want to be able to get all the information that you can you might have --> to refresh that so now see how those records are different they're similar but they're different --> so that's how you would go about your subdomain trees right --> you go through and you look at all your domains because each one is going to have a different --> layout because it essentially is a different directory right so you may have one dns that --> only deals with your clients and your customers you might have another dns that only deals with --> employees and business you might have another dns that only deals with financials and so on and so --> forth right so you always want to go through and gain access to your subdomain trees --> because they're going to give you the layout of the site how do you feel about it now --> do you feel confident in the steps so these tools form the foundation of what we know as real world --> OSINT operations right they're both proactive and they're post-breach right none of these --> things that we're doing only exist in one format they're 
consistently walked through and gained --> access and learned about each step of the way we don't just use these tools we train in --> the methodical workflows that stand up in courtrooms dashboards and congressional hearings --> right so let's say in my time frame of doing cyber security i've been fortunate enough to --> help Fortune 500 companies i've been fortunate enough to sit in on congressional hearings --> i've been able to help companies determine how they move and what they gain in their mergers and --> acquisitions it's just become a part of day-to-day life and business so now this is where the turning --> point in training comes in right so we move from all of the theory and the tools that you see to --> like full spectrum investigation right so now we're going to walk through real world scenarios --> where OSINT serves as both like an early detection mechanism as well as like a post breach evidentiary --> tool right so our goal in this point right here is to emulate how a digital --> forensics analyst or a breach response team will conduct an initial triage of a suspicious domain --> right as well as like identify related infrastructure and escalate if it's needed --> so in this scenario that we're going to use right the internal cyber security team --> at a mid-sized healthcare provider receives multiple reports of phishing emails impersonating --> their billing department so the domain in question is i'm going to give you a domain --> right now your job is going to be to investigate this domain using open source techniques to --> determine whether the domain is malicious or suspicious how it was registered and configured --> what infrastructure it is connected to and whether it can be linked to other known threat actors --> like you can do that we can do it together all right so we're going to do www.cigna.com so --> Cigna is our client right and they've received multiple reports of phishing emails 
impersonating --> their billing department right so now what we got to do is we have to determine if this domain is --> malicious or suspicious how it was registered and configured what infrastructure it is connected to --> and whether it could be linked to other known threat actors right so the first thing we want --> to do is we want to take cigna.com and we want to put that in our WHOIS lookup --> now you may have to do that part and i'll watch you because WHOIS doesn't want me to be great --> right now that's signal that was working for me and not working for you okay all right perfect --> all right so now we know that our questions are whether the domain is malicious or suspicious --> the next one is how is it registered and configured --> what infrastructure it connects to and whether it can be linked to other known threat actors --> right so based on those four questions is there anything based on the information we're looking at --> right now that answers any of them so from this WHOIS page we're able to see how it was --> registered when it was registered and configured right so we're able to see the name servers which --> is what you went to we get to see the date it was registered on i'm sorry so we're able to see --> that the data that's being provided which is the name servers the registered on and expires on dates --> gives us the configuration for the site so from there you want to know what infrastructure it is --> connected to right so what's our next step with the name servers what do we do with those so now we --> have the infrastructure it is connected to right how it's structured right --> so now what's our next step yep so see how now it's time to look for further investigation --> so it's inconclusive whether it could be linked to other known threat actors --> so we can't tell them if it's malicious or suspicious based on the information provided to us --> so if you look in our investigation based on the information we 
just pulled up we were able to see --> that the registrant isn't using privacy protection because we can see --> specifically that the name of the registrant is their actual company right we can see when the --> domain was registered and we can see what country it's located in right so that's the information --> that we would need in an open source intelligence search right so let's say for example we're going --> to say that this domain was registered less than a week ago right so we also look and hear about our --> client having issues with the phishing scams and then we found out that this domain was registered --> around the same time that the phishing scams started right so WHOIS privacy --> wouldn't essentially be inherently malicious right but it is a reason to increase suspicion --> does that make sense so if you think about it like this right imagine you own Cigna right and --> then someone comes up with a site that's named Cigna as well too but instead of with a c it's an s --> right in this process you would want there to be a distinction between how both of these are --> identified but let's say if you had a privacy issue right where you couldn't see who owned --> either one of them they just happen to be under the same server or the host right then the end --> result will be inconclusive because you don't have enough data to move forward does that make sense --> so from there right going to the DNSDumpster subdomains and then the dns directory right we're --> able to see like subdomains right so let's say for example if we looked further into --> Cigna what we would find is like a login for Cigna right for them to have their employees log in --> we would also find an email space where the emails were housed right --> we would also look at the mx records deeper and what it would show us is say for example --> zoho.com zoho.com is a third-party email provider it hosts mail and email marketing and essentially takes care of the 
--> company's campaigns right so when you have the presence of a login and an email subdomain --> it usually is going to suggest some form of phishing intent right because they mimic --> real services does that make sense so the client would provide us a copy of their directory and --> then we would gain a copy of the fraudulent information off of the website it's being --> presented on and then we will cross-reference the two so let's say the client website would --> have to have more than just email login it would have to have a space for records a --> space for your privacy notices everything versus the person with the scamming site would --> only need the things showing the information that they were trying to gain access to so like using --> the third party email provider might be an evasion tactic --> or an effort to appear legitimate for some people so we can continue to go in --> i have Shodan and Maltego left right and then after that we'll take like maybe a 15 20 minute break --> that work for you all right so even though we haven't been able to use Shodan Shodan would --> allow us to search for any exposed server and iot endpoints tied to that domain's ip range --> right so your findings could be anything from an apache web server running on port 80 with no --> https right so that's a server that's reachable from the internet but --> it's not using any security protocols for a person to have --> access right another one may be what's called a WAF right that's your web application firewall right --> no WAF shown in front of the web server right being able to see where a server is --> located and even being able to find like known vulnerabilities right these are things that show --> up when you access Shodan so like for example let's say a person hosting a login form over http --> in 2025 right with unpatched 
server software that signals like a suspicious infrastructure --> because of how we move and navigate technology today and then --> once you gather that final piece of information out of Shodan the final thing would be using --> Maltego to create your final like relationship map so once you aggregate and visualize all your --> findings using Maltego right it's just going to give you a holistic view of how the domain connects --> to the broader infrastructure right so it's going to show you how everything worked --> within that service access and how it interacts with the internet right so --> you'll input the domain you're going to do what's called resolving the ip address and track the --> shared hosting environments so you're going to see everywhere the ip address went everything --> it engaged with how it's identified in the source right then you're going to go through and verify --> the integrity of that information because it's going to have certificates attached to it and that --> certificate is going to be able to tell you any information that you can cross-reference right so --> certificates tell you what that thing that's interacting with your system will be doing --> it'll tell you what it's not supposed to be doing but it'll also give you an email and --> contact information for the person that it belongs to now if you come across a certificate that --> doesn't have the information that connects you back to the person then the end result is --> it's usually going to be something fraudulent does that make sense so the end result is usually --> going to be that you see the related domains on that same server it's going to show you all the different --> domains on it it's going to show you all of the certs shared by that same company it's going to --> show you all the emails that were used across that same space and then it's going to end up revealing to --> you that the domain is not 
isolated but it's part of a wider phishing infrastructure --> cluster right it's very rarely going to just be one space it's usually going to be something that's --> going to bloom into this very big ugly flower so it's possibly operated by a single threat actor or --> multiple people right so at this point now we've kind of gone through every space that would --> actively be navigated for open source intelligence we've gone through the type of information you --> will find we've gone through deciphering what information is needed and what information --> isn't and we've also walked through verbally communicating a process and a map and the idea --> of getting to the outcome we're looking for right so if you want we can take a short let's --> say 27 minute break we come back at 12 30 and we'll start back at another lab that sound good to --> you all right see you at 12 30. is there anything you would like for me to leave up for you to see --> while we're at break also if you would like if you come back before then and maybe --> want to look at certain things you will notice on your page that there is a student manual --> right and the student manual goes through --> everything we're doing now so if it's anything you feel like you may want to look back at --> brush over if it's anything you felt like might have been out of whack anything you feel like you --> might want me to like maybe refresh or whatever the case might be these are gonna be --> your go-to notes every site that we've gone to thus far should be at the back in your appendix --> so at any point in time going further you might feel like you're falling behind --> you got your go-to because from here everything is going to be hands-on i'm going to need you to --> show me walking through some of these steps sounds like a plan awesome i'll see you at 12 30. 
you --> ready so for the second half we're going to get more hands-on so do you have an online --> text editor you like to use or somewhere you like to take notes --> all right so if you can you can x out of that we're gonna start a whole new instance of --> firefox let me get ready all right so we're going to start this next uh going more in depth --> with google dorking right so we're going to get more understanding --> for how google dorking is used tactically in OSINT techniques we're going to use --> advanced google operators to discover vulnerable assets and public intel we're going to apply google --> dorking legally ethically and for forensic readiness we're going to examine some real --> world case studies from breach investigations we're going to integrate dorking into red team --> legal discovery and cultural threat models all right so did it one more time yeah so we should --> be at slide 12. hands-on practice tools in action uh so go to your oh you want the slides --> we're already going to google dorking so google dorking is the use of advanced --> search operators to uncover hidden or sensitive information exposed to the public web --> most of the time unintentionally right so more times than not you're not hacking systems you're --> just hacking exposure right so dorking matters in forensics and intelligence because it shows us --> the breach visibility and attribution it shows us the pretext and impersonation vectors --> it shows us the evidence discovery for court admissibility and chain of custody right so --> when we're dorking right it allows us to see in real time in an active manner how a person --> is using the thing right we're able to put this information into google and it's going to show us --> a date and a time stamp it's going to show us where that person was located it's going to give --> that person a specific indicator in the world right from there 
it's going to give us pre-texting --> and impersonation vectors right so we're going to know what ways this person fraudulently used --> other spaces to gain access to things they weren't allowed to right and then we use --> that evidence of discovery that we just received and the way that we received it to become court --> admissible with a chain of custody right so that's when you compile your information and are able to write up --> a report that's going to be used as an expert witness statement or something that would have you --> testifying or the like right so what we're going to do now is we're going to start going over --> the common search operators used on google right for google dorking but this is where your notes --> are going to come in but also at the end of your appendix there's a link to the google dorking --> information that i'm providing so if you don't get it here and don't want to write it down that's --> perfectly fine so one that we've already worked with is site right so site restricts you to a --> domain right so we do site and we do google.com then everything it's going to show us is going to --> reference google.com right so the next one is filetype right so we're going to do filetype --> yep pdf so see how pdf brought us up different responses --> so you may have to put a space yeah all right so now you can see that filetype brings up all pdfs --> right so the next one that you can use is called inurl right inurl --> so this helps you discover paths right so you can say admin login is it sticking with you --> so intitle right so this is going to be anything that's going to be in that page title right so --> let's say for example i think i showed you this one earlier index uh so index of is going to --> show you like i said before i mean the index of is basically what starts the beginning of like a --> page which usually has pertinent or important information in it right so you have an index of 
records index of whatever that thing was that directory is housing right so it's --> another avenue of being able to go through and see structures right another one that you can use is --> called cache right so you can cache and then say for example google.com i kind of knew that was --> but now you can see like all of the history and the cache of google right and that information --> is important because like i said before it allows you to see metadata and it allows you to see old --> pages right so sometimes you can take this information and have you ever heard of the --> internet archive so there's the some people call it the way back machine right but yep but you'll --> notice that when you click on it becomes the internet archive so let's say for example if --> you type google in there www.google.com what'll happen is you will see a library for all of --> google's history and at the top you'll notice that that lit that that that history for google --> goes all the way back to beyond 2002. 
so for example if you go to the top and --> you type in myspace.com right you will see it actually go the opposite direction myspace --> uh-huh dot com and see how like there's that same gap so like say if you click on 2025 --> and now it shows you the map at the bottom now if you click on any of those random bits --> now it shows you all of the internet activity for it so you have for example like you have ext --> right oh so you can do like ext for extension and it's really just an --> alternative to filetype so then let's say for example can you see my screen so let's say --> we go ext excel then we do budget right so then it brings up every --> space for the excel sheet so the next one is intext and intext will give you any body keyword --> searches right so let's say for example restricted use only so now you see that it's --> showing us every document that has restricted use only inside of the text of the body and another --> one would be link which gives you just pages linking to a url so you can do link starbucks.com --> then it'll show you everything that's linked to the starbucks url all right so now we're going to go --> into your offensive and your defensive applications right so red teaming is your --> offensive engagement right red team is always going to be attacking blue team is going to be --> your defensive applications that's going to be the thing that's always trying to protect and defend --> then you're going to have instances of what's considered purple teaming right that's --> when red and blue teams are working together to be able to get a complete model and the integrity of --> how it's performed does that make sense all right so the red team will use OSINT right for pre-engagement --> intelligence right so in intelligence you have what's considered active recon and you have --> passive recon so uh google dorking and OSINT 
helps you do what's called --> pre-engagement intelligence so you can harvest subdomains before you even have to use --> Nmap because Nmap requires you to actively engage with the system right so actively engaging --> means that you have to send a signal directly to it and it sends a signal back to you right --> whether it be a three-way handshake or the like whereas with passive recon you're looking at another --> entity that's holding this information as if it was on the site --> so finding documents with metadata whether that be author names or usernames sometimes you get to --> find exposed git repos right so have you ever used github or heard of --> github so github is a repository that developers use and they get to store like their code there --> right so sometimes you'll have a company who will store all of their keys for their company --> on github and then you'll have a leak that will show these keys to --> the public but another way that you could use --> OSINT and dorking is like for unprotected login --> panels right so how i showed you earlier that putting in that inurl colon admin forward slash --> login and then putting the company's --> website would produce a login that --> is not supposed to be visible to the average person --> so like a real world example is like let's say that the target is like a mid-sized healthcare --> provider right the type of dorking you could use would be filetype xls and then the site's --> website's name right and then more times than not doing that will show you all of the unprotected --> spreadsheets with that internal email list and then you could use that as a phishing emulation --> right that would ideally maybe 68 clicker right so you would use a piece of information publicly --> available to be able to gain access to their email list only to send them phishing emails --> so now that we've 
covered the red team side of it we want to cover the blue team side of it which --> would be the defensive approach to it so in blue teaming you want to be able to pre-audit with google --> dorking like as a digital hygiene so whenever you're going in to work with a client or you're --> going in to engage with a scenario you ideally want to have a baseline that you're --> looking from and that's usually going to be in your google dorking it's going to show you what --> things are actively communicating with the internet it's going to show you what things are --> accessible from the internet and it's going to show you what all your weaknesses are to the internet --> so you're going to use dorks to emulate attacker recon right so the same way that we --> use recon on the penetration testing side blue team side uses dorking to simulate that on their --> side when trying to patch holes right so it also searches for outdated cached pages and sensitive --> directories right so when you're building a page it should realistically have a life cycle before --> there's an advancement to a new one and a lot of times when people tend to allow their pages to --> become outdated you end up releasing sensitive information that you didn't know about right so --> usually you want to combine google dorking with Shodan there's another tool called hunter.io --> that i'll show you very quick so you also use hunter.io right which is just like Shodan --> it's just a different set of data sets as well as you can use SpiderFoot so SpiderFoot --> i believe has an online presence but it's also a tool used in Kali where you see --> SpiderFoot so you would essentially --> integrate and combine these tools --> to be able to create a full threat map of what a scenario would look like so another way that you --> do it is you could use automated python scripts and just the apis from google to be able to do it --> for you as well so now we're 
going to go into lab dorking scenarios right so our objective is going --> to be i'm going to say the objective is to discover a personal exposure for a c-level executive right so --> let's say for example in this scenario that we're looking at right now and trying to find someone --> let's say for example i would do intext go to google first --> um can we go intext confidential then you can go filetype pdf then you can go site --> google.com so now what you'll see is all of these pdfs here they're supposed to be behind --> paywalls because they all exist within the google drives outside of this one so let's say from here --> let's say it gave us any of these websites for example we're going to go to so we just go to a --> random one right let's say it has images here right we could save the image and then we --> We could go to Google, and then you could essentially run a scan on the search, on the --> photograph. --> You can take the photo, drop it into Google, and now you see every place that this thing --> shows up as well. --> So let's say, for example, with this photo, you could do an even further analysis to it --> and extract all the EXIF data from it, right? --> then using pdfinfo or you can use ExifTool right and then from there that's where you --> will begin like your OSINT chain of evidence note you would tell them the time stamp of what it was --> you found you would tell them the author you would tell them the revision history because like that's --> your forensic breadcrumbs right so you always will start with your first revision as you --> grow and learn more information you're going to revise after each one sometimes --> you can have hundreds of revisions. Sometimes you can have no revisions. It just all depends --> on the scenario. You still with me? You sure? If this is getting boring, let me know. We --> can switch it up a little bit. You good? 
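Editor's sketch: the dork from this scenario and the chain of evidence note described above can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not the course's own tooling: `build_dork`, `EvidenceNote` and its field names are hypothetical, and `example.com` stands in for whatever domain is in scope.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


def build_dork(operators: Dict[str, str]) -> str:
    """Join operator/value pairs into one query, quoting multi-word values."""
    parts = []
    for name, value in operators.items():
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    return " ".join(parts)


@dataclass
class EvidenceNote:
    """Hypothetical note structure: what was searched, where it was found, and every revision."""
    query: str
    source_url: str
    author: Optional[str] = None  # document author taken from metadata, if any
    revisions: List[dict] = field(default_factory=list)

    def revise(self, finding: str, when: Optional[datetime] = None) -> int:
        """Append a timestamped revision and return its revision number."""
        when = when or datetime.now(timezone.utc)
        self.revisions.append({
            "revision": len(self.revisions) + 1,
            "timestamp": when.isoformat(),
            "finding": finding,
        })
        return len(self.revisions)


if __name__ == "__main__":
    query = build_dork({"intext": "confidential", "filetype": "pdf", "site": "example.com"})
    note = EvidenceNote(query=query, source_url="https://example.com/report.pdf")
    note.revise("PDF found in search results; metadata not yet extracted")
    print(query)
    print(note)
```

Each later finding, such as the author field read from the PDF metadata, would be added with another `revise` call so the revision history stays in order.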
So in the next scenario, we're going to talk --> about healthcare, what's considered PII risk, right? So you always want to be able to know --> like when you're dealing with HIPAA and things like that like your different types of --> security risks as well as vulnerabilities and google dorking is always going to be your first way --> to get to that in open source intelligence right so we can go back to the thing we did last time --> where we use Cigna you can say site cigna.com and then you can say filetype pdf and then you --> can say intext ssn right social security numbers so now you have every pdf --> under the Cigna directory that references a social security number so like we would look --> at this i would unfortunately have to go through each one of these links right here and then --> i would create what's called an internal exposure score right and that score would determine --> how much vulnerable information is being exposed to the rest of the world and how we should go about --> addressing the situation so like i said you would generate your internal exposure score --> and then you would alert them via the compliance playbook right so that would be integrated with --> your SOC 2 audits so every security center is supposed to have a system or playbook --> set up for how they respond to instances where there would be a breach or intrusion and that --> would be under your SOC 2 audit so now that we've covered those two scenarios and i presented --> you with a lot of different information we're going to have to go through like what's like --> ethics law and forensic chain of custody right so the first thing to start --> with is what's legal right so querying google isn't illegal right and finding public --> files is not illegal but if you download and you manipulate these confidential --> files now that can be illegal right based on the type of information it is and exploiting 
exposed credentials that also violates everything you could ever do in --> cyber security which i'm pretty sure you know so at the moment that you find evidence --> while you're google dorking or using open source intelligence the first thing you want to do is you --> want to timestamp your query then the next thing you want to do is you want to screenshot your --> results and then you want to archive with the Wayback Machine right and the way that you're --> going to archive with the Wayback Machine is you're going to go into the Wayback Machine and you're --> going to put in that address and it's going to give you a certain amount of information on it --> that you're going to store in your report right you're also going to use what's considered a --> hash verification for collected information or collected files so what happens is when you collect --> a file you run it through a hash function like SHA-256 and that gives you a fixed-length --> digest that works like a fingerprint for that exact file right so a hash is not encryption there's no --> public key or private key involved and you can't turn the digest back into the file but if even one --> byte of the file changes the digest changes so what happens is when you pass --> information between one another you pass what's considered a hash and then the other person --> recomputes it and if the two values match the file is valid and unaltered and then from there you --> would log every step in your analysis notes so another thing that you have to keep in --> mind is you always want to document and you want to engage with caution right because OSINT if --> documented properly can be admissible as court evidence so for an example --> outside of my business in cyber security i also show support in helping people through upwork --> right and a lot of times what happens is a person will come to me because they've looked --> for cyber 
security professionals and they can't afford the services so the end result is me --> giving them a walkthrough of how to get the information that they need and then when --> they present it to me i do my own analysis and then i present them with the expert witness statement --> right and more times than not this statement is used to help at least put the law enforcement --> or the cyber security professionals on their end in the right direction --> makes sense so when it comes to the legal and ethics and the law side of it --> we also have to get into the cultural context about it and the geopolitical targeting --> that comes with it right so like you have regional threat modeling via dorking right so when you --> think about the fact that there could easily be a public sector agency in --> palestine right now right or israel right now or iran right now that's essentially attempting --> to get services for people in need in their countries and --> there could actively be google dorking campaigns stopping them from getting the resources that --> they need right so in that process you're going to look for the procurement strategy documents --> and the personal information on the field offices right so it's like let's start with red team --> first let's say red team side public sector agency and we're going to start with israel right let's --> say we're going to say palestine right the way that you will google dork that is you will --> look at the site you will look at the file type of documents and then you look for anything --> confidential right so when we say look at sites you would go to let's say we're just going to --> throw a generic one out there www.palestine.com right and you would look for confidential information --> within the documents and the thing you would be searching for would be the field offices --> does that make sense okay so like in politically sensitive regions 
exposure does not equal oversight; --> most times it is life-threatening. So one of my recent contracts was with a --> journalist who was overseas, and their identity was compromised in a certain type --> of way, and they needed to be able to get their information out of the country without --> being targeted. So the end result was me having to build a complete system for them outside of their --> access, and then giving them the credentials to access it long enough to be able to get their --> information up, and then taking it offline before anyone could follow it back based on the --> route it came. So could you see how that could be an important part of open source intelligence? --> So a lot of the ways in which Google dorking and open source intelligence have been used recently --> are around identity and identification, right? So we've had times where --> dorking has exposed implicit biases in systems, right? So we've used OSINT to be able to identify --> gendered names using naming conventions, right? We've been able to use dorking to gain --> an understanding of religious identifiers in school admin files. We've been able to use Google dorking --> to get language-specific file leaks, right? So there are so many different use cases and scenarios --> in which we would be engaging with open source intelligence. So the next one I'm going to talk --> about is AI automation in the future, right? We're going to talk about how, --> with dorking and LLMs, you end up getting the outcome of open source --> intelligence at scale, right? So that's when you combine the dork file dumps with --> LLM summarization, right? So we're talking about auto-extracting from hundreds of PDFs, right? We're --> talking about being able to use LLMs to generate dork permutations for a target domain. So you're --> talking about having an LLM that can literally generate all of the metadata and all the --> 
information needed to target someone, right? And we're also talking about using AI-assisted metadata --> enrichment, right? So you're talking about taking the metadata that's presented to you and being --> able to take it to the next level with AI, and being able to see who wrote it, when they wrote it, and from --> what machine, right? So when we say Google dorking, there's a whole lot of different ways that --> you can think about its uses, right? So you have what's considered --> the Google Hacking Database. I'm sorry, we're back to Exploit-DB, and now you get to --> see all of the exploits found based on Google dorking. So now you can see the person --> went to site:github.com and then they looked for BEGIN OPENSSH PRIVATE KEY, right? So --> let's just say, for example, we do the same thing they just did. All right, so we take --> site:github.com, and let's say we type in password list. It's literally just showing us --> all of the password lists that have been left on Upwork. I mean, I said Upwork; on GitHub. --> So based on what this is communicating right here, this would be true. --> Do you see why? So yeah, you can see that this is true based on us adding github.com, --> and rather than us using private key we just used password list, which would be essentially --> something similar, and we've seen the outcome that we had. But if we also went and used the private key query, now --> we see that we have people's private keys. See how, for example, this one here says my --> SSH private key? So we shouldn't be able to see that, because that key gives us access to people's --> information, right? So your Google Hacking Database --> is what you're going to use when you're basically trying to figure out what things exist that you can --> use to get the information you're looking for. So see how you have your categories here? --> They'll tell you about files containing 
passwords, --> vulnerable servers, juicy info, --> and it's a great thing for cyber security professionals to have access to. --> The downside is that bad actors have access to it as well. --> So there's probably someone who wakes up in the morning, comes here, looks at this, and --> then scours the internet for as many passwords as they can before a person --> gets to patch it. Just keep in mind that the one we just looked at was from August of 2024 --> and it still exists. So imagine a breach being open for 10 months, no one knows it exists, and people --> can just come here every single day and engage with it. You also have this one, --> GitDorker. So GitDorker is another script or application that's used for OSINT and --> Google dorking, and what it does is it uses the search API from GitHub and it just scours --> GitHub, that repository space that we're in now. So for example, we're --> in GitHub right now, right? And we're using an application that someone created for free, --> right? And it'll show you who supported it, like who has actually written code on it, who's created --> tickets, who's been debugging, things like that. You can see how many people are --> also using it, you can see how many people are watching it to see --> when it's updated and it changes, you can see how many times people have starred it, you can see its --> README and activity, right? So you go to the README, it just tells you about it, tells you how to use it, --> and so forth. If you want a moment to look at it, that's perfectly fine; it's not an issue. --> So with GitDorker, right, it's essentially just using the command line to give you --> an interface that scrapes the back side of GitHub. So if you're looking for passwords, it'll --> provide them to you. If you're looking for confidential information, you'll put those keywords in there, --> or those indicators in there, and it will present it to 
you. If you were looking for a specific type --> of project, it would do all of the hard work for you. So imagine everything --> that we're doing with Google and putting the different indicators in; it would not --> only do that, it compiles and stores it for you as well. So you have Dork Scanner. --> Another great resource to use is GeeksforGeeks. So it tells you about Dork Scanner, right? Usually --> what GeeksforGeeks does is it'll tell you about the thing, it'll tell you how to install it, --> it'll tell you how to interact with it, how to address it, and so forth. The Dork Scanner one --> on the GitHub, the second one, oh, no, no, the fourth one, the fourth one, --> I'm sorry. But it just gives you a complete walkthrough and screenshots to be able to follow --> along. The only thing you would do is swap the information that's being shown in the example for the --> information that you're looking for; you should get the same outcome, just with your --> own information. Another tool that you could use, which again we talked about, is SpiderFoot. --> Well, there's a tool that's supposed to show up here named SpiderFoot. It's supposed to allow you to --> essentially use the internet as a scraper as well. I was hoping it would have --> given me a screenshot. So in GeeksforGeeks I went to SpiderFoot and was able to pull --> up that same tool for you. Again, if you want, you can go to the search box at the top right --> of that page on GeeksforGeeks and you can type in SpiderFoot. --> Go to GeeksforGeeks, yeah, GeeksforGeeks, and then type in SpiderFoot. The first --> green one, the first green one, this one here. The second one, I'll hide that. Uh-huh, yep. So --> if you scroll down, you'll see again the installation and the setup, and then --> you'll see where it starts, and then it begins to run --> scans. An example shows you how it 
scans the website. So you see "hi", because that's --> the scan name. You should also be able to see Maltego here as well. So let me --> talk about Maltego, it being like a forensics application that allows --> you to map out the entire space, or allows you to provide visualization for all the data that --> you compile, and then your end result is your map. So again, your network, the servers on the --> network, email. So now that completes the breakdown for Google dorking. Do you have any questions --> about dorking and Google dorking? You sure? So the next tool we're going to work with is WHOIS. --> So as we've already discussed before, WHOIS allows us to identify ownership and domain --> intelligence, right? So when talking about WHOIS, the objective is --> to understand the WHOIS data and why it's important to OSINT. We're going to use --> WHOIS lookups and information from other tools to show how you create a domain infrastructure analysis. --> We're going to identify registration patterns, historical data, and network associations. --> We're going to be able to apply WHOIS information in breach investigations, --> threat attribution, and red team operations, and we're going to apply ethical standards and --> legal guidelines when using WHOIS data, right? So, all right, basically WHOIS is a query and --> response protocol used to query domain name registries. It provides detailed --> information about domain registration, ownership, administrative contacts, and more. So basically the --> data just helps you see who's behind the curtain, whether you're tracing a hacker's infrastructure --> or you're investigating a breach. So it's used for red teaming and blue teaming, right? So we're going to --> go back and keep using Google. So see where it says enter domain or IP? Yep, you can type --> www.google.com. Okay, all right, perfect. So as you can see, we got our domain, or our registrar, --> right? I mean, just registrar, 
right? And that's the entity that's responsible for the domain --> registration. So that's going to be the person who signed up for the domain; that's going to be the --> person that is going to be essentially legally responsible for anything that happens with this --> server or this space. You're going to have your registrant's name, --> you will have the person that's your contact, you're going to have any technical contacts that --> you reach out to about information, and you're going to get any other related domain names. So --> also in that you will find your creation and your expiration dates, as we've seen at the top, --> and then you'll also get your name servers and your IP address locations, right? So the information --> that you can get on WHOIS is either going to be public or it's going to be private. The public --> information is going to be available to everyone, and it's usually going to be on the registry --> or the lookup site itself. Or if you come across a private one, it's going to --> have limited visibility, right? So you may run into a domain that's going to be registered under --> godaddy.com, right? Or you may see one that's registered under northwestregisteredagent.com, --> right? Because these entities block the visible identity of who their owners are, --> right? So there are multiple different tools you can use for WHOIS lookups, right? So one of them --> that we've already used earlier was ICANN, and we used ICANN to be able to find the same information --> we were looking for. Another one is whois.domaintools.com. You type in www.google.com --> here. So you notice how your WHOIS still provides you the same information, such as your registrar, --> your registrant's name, your organization, admin and tech contacts, creation and expiration dates, name --> servers, IP addresses, right? So sometimes you're going to use more than one WHOIS tool because --> you're going to get different information from each one. You also have 
maybe that one's only --> available to me. You have ViewDNS. So when you go to ViewDNS, you see all the tools that come --> up that can be used in open source intelligence, right? So on ViewDNS you can go to your --> reverse WHOIS lookup, reverse IP lookup, IP history, and so forth. Another tool you can use is --> DomainBigData. It shows you the different packages they have for all of the databases of WHOIS. --> Depending on what sector you want, what industry you want, what information you want, it gives you --> everything you need. You could come here and make an email list, you could come here and make --> essentially a lot of things, some good and some bad. Then you've got WhoisXML, and WhoisXML just --> gives it to you in the XML layout. So I want you to go to the ICANN WHOIS site. --> I'm going to give you a domain name, and I want you to be able to tell me who the registrar is, --> like, where was the domain registered. I want you to tell me --> the registrant, which is who is the owner. I want you to tell me the name servers, which is where the --> domain is hosted, and I want you to tell me the creation date. Yeah, you can do that. Hello, testing, --> can you hear me? Okay. So are you able to walk through WHOIS and --> give me that information? All right, so let's go to WHOIS. I can't see where you're clicking. --> Okay, all right. So I'm going to give you www.intelligentsecuritiesgroup.com. --> So this would be an example of a site that would be protected. So this is my website. --> So if you notice, it'll give you the domain name, but if you ask who I am, --> it tells you Wix. But if you go to status, it'll tell you client transfer prohibited --> and client update prohibited. So that's a space, for an example, where it would be blocked. --> So another one to go to would be www.target.com. --> Target, T-A-R-G-E-T, yep, .com. 
--> So if you can, let me know the registrar information. --> Let me know the registrant, the name servers, and the creation date. --> Yep. --> Can you tell me who the owner of those servers is? --> So see how it says GoDaddy? --> So that means it's another protected site. So normally, in today's current age, when you go --> to try to see the registrant for a site, this is what you're going to run into when it comes to --> corporations and businesses and things like that. So this right here is always going to be your --> first step in understanding how a domain fits into the broader cyber ecosystem, basically, right? So --> this is going to give you your first circle on your map, right? That circle on that map --> is now going to have four key points, right? It's going to have your name servers, that's going to --> be a point, and there are going to be four lines off of that. You're going to have your location, that's --> going to be a point, and there are going to be lines off of that. You're going to have your registrant, --> your creation date, that's going to be an indicator, and various other things, --> right? So let's say, for example, you can use your creation date based on what you found in WHOIS, --> right? You could take the creation date, and you can also take the --> company name, and using Google dorking you could essentially be able to find --> a map of that person's entire existence, or that item's or that device's entire --> existence. So more times than not you'll see that the registrar is GoDaddy, it'll say the corporation --> is XYZ Corporation, the creation date is 2015. 
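The four items asked for in the exercise (registrar, registrant, name servers, creation date) can be pulled out of raw WHOIS text with a few lines of Python. This is a minimal sketch, not a full parser: the sample record is invented for illustration, and the field labels ("Registrar", "Registrant Organization", "Creation Date", "Name Server") are assumed to follow the common "Key: value" layout, which real registries do not all share.

```python
# Hypothetical WHOIS response; every value here is made up for illustration.
SAMPLE_WHOIS = """\
Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Creation Date: 2015-03-01T00:00:00Z
Registrant Organization: REDACTED FOR PRIVACY
Name Server: NS1.EXAMPLE.NET
Name Server: NS2.EXAMPLE.NET
"""

# Assumed field labels; adjust them to match the lookup tool actually used.
FIELDS = {
    "registrar": "Registrar",
    "registrant": "Registrant Organization",
    "created": "Creation Date",
}


def parse_whois(text):
    """Extract registrar, registrant, creation date, and name servers."""
    record = {"name_servers": []}
    for line in text.splitlines():
        # Split on the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Name Server":
            record["name_servers"].append(value.lower())
        for field, label in FIELDS.items():
            # Keep the first occurrence of each single-valued field.
            if key == label and field not in record:
                record[field] = value
    return record


record = parse_whois(SAMPLE_WHOIS)
for name, value in record.items():
    print(name, "=", value)
```

A redacted registrant, as in the sample, is the "protected site" case described above: the field is present but carries no owner identity.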
So WHOIS isn't always, I'm saying that to say --> WHOIS isn't always going to be a smoking gun, but it's often going to be the first clue in your --> cyber investigation, right? You're never going to get the same information twice for different --> companies, so you grow comfortable with seeing the "not available", you grow comfortable with --> not having access, because that's what it's supposed to do. It's kind of more of a concern --> when you can do certain things, if that makes sense. So now we're going to talk about using WHOIS --> for attribution and infrastructure mapping, right? So I talked to you earlier about attribution, --> and that's the complete scope of a thing, right? So that's being able to tell the --> network it used, that's being able to tell the device that was used, and the individual --> that device belongs to, right? So in the same instance you would use --> that for attribution in regards to tracing malicious actors, right? So something happened on --> the network; you now have the device MAC address or you have an IP attached to it, and now you know --> the person who purchased it, right? Even if it wasn't that person who did it, they would know who the --> person was that had access to said device, and then you would be able to begin your investigation. --> So for example, in this situation you could use DomainBigData, right, which is going to give --> you the historical WHOIS data, because sometimes you're going to want to know the entire history of --> the thing that you're searching versus just a snapshot, right? And you'll be able to use --> this information to identify domain ownership history, as we talked about before, especially in --> cases where domain hopping or infrastructure obfuscation comes in, right? So you have issues --> where, let's say, a person owns a company, but rather than being on top of paying the bill every --> year, somewhere it 
lapses, and now they've restarted with a completely different domain name, --> right? This would be considered domain hopping, right? You would also have what's considered --> infrastructure obfuscation, when you would have a person with a certain domain name and it would be --> linked to a completely different domain name, right? So there's no reason why www.red.com --> would have to reference www.blue.com. Does that make sense? Okay. So let's say, for an example, --> a cyber criminal uses a domain to host a phishing site, right? And that domain was registered to John Doe, --> but it was later changed to Anonymous LLC, right? This history can help investigators identify --> when the site was likely used and trace it back to prior owners, right? So let's say when it was --> registered to John Doe it was 2015 to 2024, but in 2025 it became Anonymous LLC. Now if we're looking for --> an intrusion from 2015 to 2024, then it would have to be under John Doe, correct? But if it was --> post-2024, then it would more likely lead to Anonymous LLC, right? So by examining this --> WHOIS history, we can track the life cycle of a malicious domain, and that can lead us back to the --> actor behind the attack. So there are times when you're going to pair --> WHOIS and DNS records, and that's going to give you what's called a network map, right? So that's --> when we talk about that name server analysis. So that WHOIS lookup is --> not only going to give you the name server of the domain, but you tracing that domain's name servers, --> IP addresses, and domain history is going to allow you to discover the related sites and infrastructure. --> So you remember how we started out with WHOIS, oh, how we started out with the name of a site, and then we went to WHOIS, and then WHOIS gave us the name server, and then the name server broke it down for us. --> We were able to take the name server and put that into DNSDumpster. --> You remember that part? 
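The John Doe / Anonymous LLC reasoning above amounts to a date-range lookup against the historical WHOIS record. A minimal sketch, assuming the ownership history has already been collected by hand from a historical WHOIS source; the exact start and end dates below are hypothetical, chosen only to match the years in the example.

```python
from datetime import date

# Hypothetical ownership history mirroring the example:
# (registrant, first day of ownership, last day of ownership or None if current).
HISTORY = [
    ("John Doe", date(2015, 1, 1), date(2024, 12, 31)),
    ("Anonymous LLC", date(2025, 1, 1), None),
]


def registrant_on(history, when):
    """Return the registrant on record for a given date, or None if unknown."""
    for name, start, end in history:
        if start <= when and (end is None or when <= end):
            return name
    return None


# An intrusion dated mid-2019 falls inside the John Doe window.
print(registrant_on(HISTORY, date(2019, 6, 1)))
# An intrusion dated 2025 points to the later registrant instead.
print(registrant_on(HISTORY, date(2025, 3, 1)))
```

A result of None means the incident predates the collected history, which is itself worth recording in the analysis notes.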
--> So using those skills is how you can discover those related sites and infrastructure, right? --> So that's how we were able to find that. --> Remember, we used the name server, and then we were able to identify the other domains. --> Remember we were able to take that IP address and put it into --> GreyNoise, and we were able to see the MX records and things like that. Where is that? This one. --> So remember when we went to GreyNoise, and then when we looked at the IP address it was able to --> show us basically the entire layout, and it can reveal dozens of other domains --> related to a malicious infrastructure, right? So it's not just the IP address that you're presented --> here, but also the spoofed IP address, the origins of the spoofed IP address, and things of --> that sort, right? So when you go to your first WHOIS lookup and --> you find those dozens of domains, that can be the jump-off point for your deeper investigation, --> right? Let's say a client tells you in their scope that they --> only have, let's say, three domains, but now you find five or eight. Now you know that there's something --> wrong, right? So another reason why you would use WHOIS is for what's considered digital footprint --> analysis, right? Your WHOIS data can expose linked domains through shared registrants, --> emails, and contact info. So the gift and the curse of having the hidden domain registry --> is that now you've given a hacker or an intruder --> the opportunity to knock at one space --> versus, let's say, the tens of thousands of spaces --> that they hold access to. --> Does that make sense? --> So that all being stored --> and being at the mercy of someone else's security, --> someone else's availability, integrity, confidentiality, --> is why we use WHOIS for digital footprint analysis, right? 
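Two of the checks just described, finding domains linked by a shared registrant email and spotting domains the client never listed in scope, reduce to grouping and set difference. The sketch below assumes the WHOIS records have already been gathered into dictionaries; the domains, emails, and field names are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical WHOIS results; none of these values come from a real lookup.
RECORDS = [
    {"domain": "example-a.com", "registrant_email": "admin@example.org"},
    {"domain": "example-b.com", "registrant_email": "Admin@Example.org"},
    {"domain": "example-c.com", "registrant_email": "other@example.net"},
]


def domains_by_email(records):
    """Group domains by registrant email, keeping only emails shared by 2+ domains."""
    groups = defaultdict(set)
    for rec in records:
        # Normalize case and whitespace so trivially different spellings match.
        email = rec["registrant_email"].strip().lower()
        groups[email].add(rec["domain"])
    return {email: sorted(doms) for email, doms in groups.items() if len(doms) > 1}


def out_of_scope(found, declared):
    """Return domains discovered during research that the client did not declare."""
    return sorted(set(found) - set(declared))


print(domains_by_email(RECORDS))
print(out_of_scope([r["domain"] for r in RECORDS], ["example-a.com"]))
```

Any domain returned by `out_of_scope` is a question to take back to the client before going further, not a target to investigate on your own.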
--> So an example would be, let's say that domain A is hackedfinancials.com, right? But the --> registrant email for that is contact@financialhelp.com, right? That would be a --> mismatch. Does that make sense? So the domain name is hackedfinancials.com, so you would assume that --> the email would be contact@hackedfinancials.com. So the fact that they have mismatched domains --> would mean that there's something fraudulent going on in this situation. --> Does that make sense? So imagine your network is your home, --> right? You wouldn't put your kitchen outside your home, --> right? Because it's within your living space. So it's the same thing with your domains: --> your domains would be within your network. You would never want your domain to be housed outside --> of it, because then it becomes a leak for information. So in that situation, for example, --> where you would see this information in WHOIS, where you would have the mismatched domains, you --> would look up both of the domains that share that same registrant email, right, to identify any --> additional sites attached to the attacker. Does that make sense? Okay. So in part four we're --> going to talk about legal and ethical considerations when using WHOIS, right? So the legality of using --> WHOIS data is that WHOIS data is generally public, right? But it is also required --> to be used ethically. So it just being online isn't enough for you to say it's available to you; --> you're still required to be in compliance, right? So you always have to ensure --> that your research aligns with whatever governing, regulatory, and privacy laws apply --> for the state or the industry that you're working in, right? You can't scrape these things and you can't --> spam people with it, right? So you can't use that WHOIS data to flood contacts with --> unsolicited emails or conduct targeted cyber 
attacks. --> So let's say, for example, you couldn't use WHOIS to go online and find vulnerable --> or weak websites to coerce their owners into allowing you to do web design for them. That's --> illegal, right? When you're dealing with the legalities of the chain of custody, right, --> especially for investigations and legal cases, you always have to maintain forensic --> integrity by documenting your sources and your processes. So WHOIS can reveal a --> lot, but you must tread carefully and ethically, because ethical hacking starts with responsible --> research. So your responsible use of WHOIS data is going to be based on how you document it, right? --> You're always going to record the source URL and your timestamps for any WHOIS lookups. It's going to be --> the verification that you have, right, how you cross-check the WHOIS data with other open --> source intelligence tools to verify the integrity of the findings, and just how well you write your --> report, right? So now we're going to work on applying WHOIS to real-world cases. All right, so --> right now the case that you and I will be working on is investigating a cybercrime syndicate that --> used multiple domains to launch a series of spear phishing campaigns, right? So in this process the --> first thing you want to do is conduct a WHOIS lookup on the several suspicious domains, --> right? And these multiple domains are going to give you access to potentially multiple different --> email addresses or one single email address, right? Now if you get one single email address, then you --> know which domain you should follow, because that's the email address that the person is using to --> communicate with, right? And then you begin to track the WHOIS history to uncover --> the domains' past and their geographical ties, right? You want to be able to know where they --> house their server, because if it's somewhere within what we consider the Eyes, then you're --> allowed 
to request information from them for cyber security purposes. So we have the 14 Eyes, --> the Nine Eyes, and the other Eyes groupings, and they're all based on the ways in which countries have engaged --> with us for cyber security purposes. So once you track down that WHOIS history to uncover the --> domains' past and their geographical ties, you will correlate this with the IP addresses that you --> found in the domains' DNS records, and then you will reverse-track those IP addresses to --> physical locations. Does that make sense? So each IP address is going to be attached to a physical --> location, and you're just reverse engineering back to gain the information based on --> that location. So basically what would occur in this situation that we just --> explained in the process is that the shared email led to the discovery of several --> other fraudulent domains. The investigator would have cross-checked that WHOIS data with --> the geographical location to locate a regional server provider which was used by them, right? So --> more times than not you're going to find a position and a place for that server, and then --> that's where you're going to begin the process of trying to break down and understand the directories --> left in it. Sound good? So now I'm going to give you a site to look at, and I need you to give me --> what is the most important information on that website. Oh, the site I want you to look up --> is www.thesecuritynoob.com. So the website is www.thesecurity, yeah, the, T-H-E, --> security noob, noob.com. So this is a good friend of mine, his name is Heath. --> He wouldn't have an issue with us using his site. So what I need you to do is, if it's possible, --> could you, based on this website, enter his information into WHOIS and tell me who the --> owner is? Yep. And how long has it been active? Yep, I think going on six years. --> So how comfortable do you feel with Google dorking and WHOIS so far? Okay, good, good, 
good --> so next thing we're going to work with is showdown one second trying to get it to see if it's all let --> me log in one second one second all right so all right so now getting into showdown so as you --> already know showdown is a search engine that indicates internet connected devices --> enables visibility into exposed services such as unsecured webcams industrial control systems --> and default configuration configured servers right so based on the ip addresses and the --> information that you would gain from the dns dumpster right would provide you with these ip --> addresses that would tell you essentially would essentially give you a threat map of that device --> so i'm trying to i have a sample here i'm trying to see if it's going to allow me to pull it up --> for you but it's giving me a hard time so bear with me for two seconds i apologize --> we would like you to see this