Episode 1

September 11, 2024

00:36:40

AI, Superheroes, and the Coolest Room Ever with Adrian Belanger

Pipeline Things

Show Notes

Did we intentionally film an AI episode in front of robot monster art? Maybe…

As machine learning becomes more prevalent, the industry is actively exploring how to put AI's capabilities to work. On the newest episode of Pipeline Things, Christopher De Leon and Rhett Dotson are joined by Adrian Belanger to explore the application of machine learning through the lens of his IPC 2022 paper: "Not All Data Is Good Data: The Challenges of Using Machine Learning with ILI."

While machine learning can efficiently analyze data and uncover patterns for analysts, Rhett, Chris, and Adrian discuss the critical role that quality inputs play in producing actionable outputs. Bad data can arise from mismatches between ILI data and field data, or from incomplete data that has been excluded from dig reports. Belanger discusses the importance of screening data and looking for anomalies when utilizing AI.

Rhett, Chris, and Adrian discuss the growth and use of machine learning, what good data is, and how machine learning can effectively be utilized in the industry.

Highlights: 

  • What type of data is fed into machine learning, and what goes wrong?
  • Is the future of machine learning to replace jobs in pipeline integrity?
  • Does machine learning leverage parameter data, or just ILI signal data?
  • What does good and bad data look like from Adrian’s perspective?
  • What are the common mistakes with data?

Connect: 

Rhett Dotson 

Christopher De Leon 

D2 Integrity 

Adrian Belanger 

Be sure to subscribe and leave a comment or rating! 

Pipeline Things is presented by D2 Integrity and produced by ADV Marketing.  

D2 Integrity (D2I) is providing this podcast as an educational resource, but it is neither a legal interpretation nor a statement of D2I policy. Reference to any specific product or entity does not constitute an endorsement or recommendation by D2 Integrity. The views expressed by guests are their own and their appearance on the program does not imply an endorsement of them or any entity they represent. Views and opinions expressed by D2I employees are those of the employees and do not necessarily reflect the view of D2I or any of its officials. If you have any questions about this disclaimer, please contact Sarah Roberts at [email protected]. 

Copyright D2 Integrity © 2024


Episode Transcript

00:00 Rhett: Let's go. The eyes. Let's do it, dude. All right. Welcome to today's edition of Pipeline Things. I am your host, Rhett Dotson, my co-host, Christopher De Leon. Really excited to be with you. 00:12 Rhett: If you're not watching the YouTube channel, I- just a word of warning. I don't know if too many people understand the level of- maybe "nerd-dom" that Christopher and I probably exist in. 00:24 Christopher: If it wasn't for the podcast, no one would truly know much about us. Rhett: Yeah, it's true. Christopher: If you think about it. Rhett: If you go back and watch the earlier podcasts, there are all sorts of easter eggs planted all up in those that people probably aren't aware of, and we have some easter eggs today, but the truth is: uh, Chris and I are a bit of comic book nerds. Um, I was even a video game nerd. Christopher: Yep, and a computer game nerd. Rhett: Anime. There's- in fact, whenever we found out that we played StarCraft together, it was like "What??" 00:51 Rhett: Then I found out he collected comics. Man, I still have a stack of comics in my mom's house, right? And so today, we have the luxury because of our guest. Our guest brought it up because of his own ties to "nerd-dom". We are shooting in Dragon's Lair Fantasy Cards and Comics in Cypress, Texas, off of Tomball. And they gave us this room, which is- is the coolest room we've shot in so far. Yes, for those of you- Chris: It is arguable that this was the right room for us to shoot in. Rhett: Yes. Chris: Trip S? 01:19 Rhett: Trip S did make a shortcoming there, but let's not share that with the audience, let's keep her image intact. Chris: Yeah, this is a cool room. We got Sauron, the Night King. Rhett: Yes!!! And Mechagodzilla. Chris: Mechagodzilla. 01:28 Rhett: And for those of you who know Chris, probably in a past life or worked with him- for all of the ILI analysts who had to work with him while he was at Energy Transfer, it is completely appropriate that he sits in front of the Sauron dark tower image in this podcast. Like, it is completely appropriate. So um, yes, but I love this. So, Chris, real quick. Rhett: All right, lightning round, you like to do this. Chris: Oh, God. Rhett: You have to choose, all right. Chris: I have no idea what he's doing. Rhett: All right, brace yourself. Chris: All right. 01:49 Rhett: 'Cause you only get- you gotta answer quick. Who wins the fight? Chris: Oh boy… Rhett: Hulk or Thor? 02:00 Chris: Oh, Thor, hands down, no problem. Rhett: Wolverine or Deadpool? Chris: Uh, Wolverine, no problem. Rhett: Spider-Man or Captain America? Chris: Spider-Man, no problem. Rhett: Are you serious? Chris: Especially Spider-Man 2099. Game over. It's simple. 02:13 Rhett: I don't know that I agree with your answers, but we'll keep going. Chris: That isn't even hard. Look- Rhett: Iron Man or Batman? Chris: Oh, Iron Man, but it hurts to say that 'cause you know Batman's my dude. Rhett: Yeah. Chris: Batman's my dude. Rhett: Iron Man's my dude. Okay. I- I just wanted to check. I mean, we could waste a lot of time going into your reasons for those choices. I think Spider-Man versus Captain America is questionable, but we'll let it slide. Chris: Okay, let's put it like this: maybe he would beat Parker, Peter Parker, but I don't think he would beat Miguel O'Hara. There's no chance. Spider-Man 2099's got him. Sorry. We can debate that offline. Rhett: Wow. Chris: Over margaritas. 02:53 Rhett: Oh, so on that note, I am going to bring our guest on, and we can ask him what he thinks about this environment that we're shooting in as well.
But I am excited. So, we're continuing our IPC series here, 03:06 Rhett: and I'll just go ahead and bring him on. Our guest today is Adrian Belanger from TDW, and he is the author of the 2022 IPC paper, "Not All Data Is Good Data: The Challenges of Using Machine Learning with ILI". So that's what we're really going to get into in this episode, after we cover some, um, "nerd-dom". 03:38 Rhett: So, Adrian, welcome to the show. Adrian: Oh, thank you very much. Rhett: So, what do you think about this vibe, this, this environment? Adrian: Oh, this is a great place, man. I just told you guys I was into gaming, and Chris said, hey, Dragon's Lair. So, I come in here and it's like this place blows away any of the places I've been. Usually, it's just a little hole in the wall, some comic books, a rack with some more games. But man, this is decked out, tables everywhere. 03:50 Chris: Yeah, this is awesome. You know what, I also think it's mildly inappropriate. So, as I said, there is another room that I felt was more appropriate. Rhett: Well, you- we- let it go. Chris, let it go. Chris: Hold on, but there is some redemption here, right? And the basis is: how many times have we heard that our customers feel like ILI is just black magic? You know what I mean? It's just like. Rhett: So, you want to get into space magic? Is that what you're saying? Chris: And now today the topic is this nebulous concept of machine learning. So, the black magic just gets deeper. 04:19 Chris: So, I do feel like this room, between the Night King and Sauron, and it just definitely... 04:24 Rhett: Mechagodzilla. I mean, we have it going on. Chris: The machine learning, the child of potentially both of them, right? Rhett: Yeah, and I'm not sure if that's Frodo or if, you know, umm, about to drop the ring right into the... Chris: Gollum. Rhett: It could be Gollum right before. Anyway, 04:38 Rhett: Adrian, welcome. Adrian: Thank you. Rhett: So, we're super excited to have you. So, tell us a little bit about... Go ahead and introduce yourself to the audience, if you don't mind. Who is, who is Adrian? How did you accidentally find yourself on this podcast? All the above. Adrian: Oh, okay. Well, I'll give you a little background, how I got into ILI. It started off wanting to go into physics, and I didn't find too many job opportunities with a bachelor's in physics. 05:03 Adrian: So, um... So I went back for medical physics. Went into MRI, living in Houston. Was looking for other job opportunities in the scope, and there was a strange thing called "inline inspection". 05:15 Adrian: Once they showed me you could put these tools and magnets on there, I was hooked. So, I started- Rhett: Really? Adrian: Yeah, so I started at Tuboscope. Worked at Tuboscope all the way up until its- Rhett: And so, Tuboscope, 05:25 Rhett: 90s, 2000s? What is this? Adrian: I started working in 96. Rhett: 96, okay. Adrian: So that was when Tuboscope was not acquired by anybody. Chris: That was Linalog. Adrian: Yeah, Linalog, that's right. I remember we had a lot of Linalog. We had books, but we still had to go back and get the scrolls and look through paper. And then, in order to measure a dent, you had a little cardboard card with a clear plastic with three little lines on it, so you had light, moderate. You only had to grade the deepest one in there, and then you just only did the welds for the three upstream, the three downstream, so- Rhett: Wow, were you an analyst? Did you work as an analyst? Adrian: I started as an analyst.
But then when I started… Rhett: So a degree in physics, and you're working as an analyst for Tu- that is incredible. Adrian: You gotta start at the bottom! Work your way up. 06:09 So, but because of my physics background I started solving some of the problems in ILI, so I quickly went to engineering. Chris: Yeah. Adrian: And started coming up with the algorithms for the, the grading and the sizing- the tools that Tuboscope was using, and then Tuboscope had a partnership with NDT, 06:25 which is now NDT Global, but this is before NDT Global. And so, uh, worked some UT tools and loved that stuff. Was wanting to get the synergy of the data, but that's when NDT kind of went bankrupt, and I came over to TDW because they were like the next generation of MFL tools here in the United States. 06:47 Adrian: And previously, we worked with James Simic. Rhett: Yep. Adrian: He brought the SMFL over there. So, I just immediately attached to that. And they had to, they incorporated the low-field magnetization that was, we uh, I had worked with Tel back when I was with Tuboscope. So, I just continued on from there. And we're still running into many of the same problems we had before, where you don't get the dig feedback unless it's wrong. 07:15 Rhett: That is the most common theme. Adrian: Yeah, yeah. So as we started moving into the age- but TDW had a lot of dig feedback. So they were actually, 07:26 we were, I should say we, we were actually modeling our stuff on actual field data. Whereas before, I was always making the algorithm software where you use a machine to put pits into a pipe, and you do algorithms off that. Send a tool off, do a prediction. Rhett: So looking at the, uh, the paper, was that the genesis? Was the genesis the- the volume of- of feedback that you get? Adrian: The genesis was the quality of feedback, 'cause with TDW, even though they had lots of dig feedback, the- the quality was still: you get this report, you got a depth on there, a picture, and you know, the guys in the ditch are not the best photographers in the world, so you're not getting: "Is that the pit that you measured?" Rhett: If you get a picture. Adrian: Yeah, if you get a picture. Speaking of which, like, I 08:10 Adrian: remember when we had rubbings. You know, they'd do the rubbings? Rhett: I did not know they did rubbings in the field. This is- I learned something new today. I had no- I never knew we did rubbings. Adrian: Which is actually very helpful, because then you could actually line it up with your signal and go, 08:21 Adrian: oh, that is this guy here. We were sizing this one here. Rhett: Wow. Adrian: So, there was a lot of human interaction with that. So, coming up in the age of machine learning, 08:31 Adrian: though, you guys start automating this stuff. You can't have, be having an analyst, you know, doing all this stuff by hand all the time. Rhett: Yeah. So, is it fair to say? I mean, is it fair to say, 08:42 Rhett: at TDW- like, one of the reasons I chose your paper is the focus around machine learning, because, and hopefully this doesn't offend you, I feel like that is a buzzword in our industry. Adrian: Oh, definitely, yeah. Rhett: I told you before, 08:53 Rhett: it feels like it's thrown around everywhere. I've seen machine learning for nothing more than linear regression. Like, how is TDW, or how are, what is your involvement with machine learning and ILI? 09:04 Rhett: How are you using it in the context of this paper? What are you using it for?
Adrian: We, it's tailored to the problems that you're trying to solve, and that's kind of what that paper was about. I wanted to kind of, like, demystify it. So, I went down all the different types of machine learning. You know, you have your tree learnings, your convolutional networks; it depends on the type of data you're dealing with. For a lot of the dig feedback we get, the tree, the decision trees are the better fit, 'cause you're only getting very sparse information with depth, whereas if you get a full laser scan, then you can start doing convolutional training on it. So it depends on the type of data you're getting. So, what we mostly do, because we're mostly getting our dig feedback 09:44 Adrian: in the form of reports, we're using, like, GPT- I'm sorry, that's the buzzword in my mind- we're using gradient boost, which is a type of decision tree, for it. Rhett: I had no idea. I honestly wouldn't have known the difference between the two of those if you would have told me. 10:00 Rhett: In fact, that's the first time I'd heard that. And you realize, you just mentioned decision tree, which kind of made sense, and then you mentioned a gradient one. So, you're telling me that out there, there are obviously different types of AI models. 10:11 Adrian: Everyone- whenever you can tweak or tune a way to train it, you've got a new model. So that's why there are just tons of words. Everybody who does something a little bit different publishes the paper and gets a model named after it. So gradient boost is based more on the boost. They're just using a gradient to solve it faster, that's all. Rhett: Gotcha. So, um, and I'm just curious, right, again, 'cause AI is all the rage. At one point I remember talking to an analyst, actually from Baker Hughes, and I think she had been there for 30-plus years, you know, and she joked that somebody had said 20 years ago, "Your job will be replaced by a computer". 10:50 Rhett: She's like, here I am, you know, 30, you know, 10, 20 years later. Adrian: Yeah, yeah. Rhett: Is this the type of thing, when you talk about using machine learning- do you, is the vision to replace that? How do you guys see it working? 11:01 Adrian: It opens up more problems that you can solve. So most of the bread and butter of ILI has been corrosion, right? Rhett: Yeah. Adrian: You're just sizing corrosion. But now there are a lot of integrity threats where we kind of like, 11:13 Adrian: we're kind of iffy: mechanical damage, hard spots, seam welds. You really need the analysts to go in there to kind of say, well, you know, this one probably looks more susceptible than that. So, once you get the machine learning in there, you start making those problems actually tractable. You get real data for analysts to work with so they can start making more accurate decisions, and one of the biggest ones, like one of the 11:39 Adrian: hardest ones we've had to deal with- I know a lot of people probably know about supervised, unsupervised learning. So, the stuff we're doing is supervised, because you want, you have labeled data: you know what the actual dig feedback is. 11:51 Adrian: So you're trying to match it. But let's say when you do the traceable, verifiable, and complete records, you get these signals of pipe. They're not, they're incomplete, because no one knows what that pipe is. So, you go in there, and 12:02 Adrian: you're trying to take the signals from each of the joints of that pipe and find patterns in it.
You don't know what those patterns are going to be; you need the machine learning to go in there and say, this repeats, this repeats, this repeats, it doesn't show up here. So, that's different. So if you had an analyst do that by hand, they might find some of it, but the machine- machine learning can really go down and try and find some real subtle features that the- an analyst might miss. Rhett: Interesting. So it's like putting an analyst on, you know, like maybe a super-powered analyst, like maybe giving them a suit of armor or a Batman belt, or maybe it shoo- 12:31 No, it's Jarvis, bro. Come on. Aren't you Iron Man? Like, it's Jarvis. How do you not home-run that? Rhett: Is that what you see going on? Is that what's going to happen at TDW? There'll be a whole bunch of analysts in the room. They'll be like, Jarvis. Adrian: I like the utility belt analogy. 12:43 Rhett: Jarvis. Jarvis, is this corrosion? Chris: It's like, it's not a like... Adrian: Well, you know what happens with Jarvis, though, so. Rhett: It's true. He does. Here comes Ultron, and granted, that's true. Chris: But one of the things I appreciate about some of the things you said, 12:59 Chris: Adrian, is our customers a lot of times trust the ILI system as a whole, right? And they trust the passion and the innovation that the ILI community has brought for the last 30 years, at least, that we've had the privilege of witnessing. 13:16 Chris: Not that we've been in it for 30 years, but you know, we've worked with a lot of technologies that manifested in the 90s and so on. And you mentioned that it allows the analysts to kind of solve different problems. 13:30 Chris: And you can imagine a customer sitting here trying to evaluate five different service providers, five different technology types, all with different offerings, and now we throw AI into the mix, or machine learning. 13:43 Chris: And so it might be hard for them to appreciate what problems are actually being solved, and I felt like you kind of started going into that. You know, the way I like to frame it, maybe you can add to it, is with terminology that we're very used to, like an automatic feature search. You know, customers might be used to that, right, where you ran an ILI tool, you know, the tool 14:04 Chris: performed well, and the analyst will say, "Hey, we can be on schedule for this analysis to meet your reporting timeline because the automatic feature search only produced this many features. So, we're estimating we can get this out on time". When we think about machine learning, a lot of times I feel like there are different tasks, broad tasks, that we see machine learning helping analyze; maybe you can elaborate. It's, for example, classification. Like, are y'all using it for the perspective of, like, identifying things like girth welds and reference points? 14:34 Chris: Or is it more towards saying, this is corrosion? Or even is it a step deeper, where you guys are saying, "We know it's corrosion, but we use our machine learning to appropriately size it"? 14:46 Adrian: Um, you threw out multiple different things there, but that's good, because, like, let's just take the simple one: girth welds. You don't really need machine learning for that; that's actually just simple algorithms you can go in with. The patterns are very significant. Now, 15:00 Adrian: if you want to, like, find girth welds that might be susceptible to cracking, that's where you want machine learning.
They could say, okay, I have a probability that this guy may have a crack in it. 15:10 Adrian: These guys don't. And then there are metrics you can do with that. So, you can, you can- they're called hyperparameters. You can tune it, like: I can handle false positives; I don't want to miss anything. Chris: Yeah. 15:22 Adrian: So, when you say this is clean, you're, like, 99.9% confident this is clean, and then you can focus on all those other guys. So… and that's quantifiable, that's verifiable, you can measure that. Chris: So, you're also leveraging, in- in the idea of using machine learning for this application, what I almost call, like, the metadata of parameter data and not just the ILI signal data. Adrian: The- the- each one of them, you'd 15:57 Adrian: have a model for each problem you're trying to solve. Chris: Yeah. Adrian: And then you're talking about the meta would be, like, the suite that you have all these guys in. Chris: Yeah. Adrian: And then take it even further, because I get really data-focused, I mean, so like with ILI, it's- it's- you can also do your records. Chris: Sure. Adrian: You can- you can- you can. Uh, I know, like, I think TCE had something at the PPIM about how they actually were using ChatGPT to go 16:18 Adrian: through all the documentation in the company, and they can look stuff up, and it's like… So, like, in our company, there's so much stuff that's documented, but it might be over here, it might be over there, and it might be on a different server. So if you have a machine learning that's, like, on the ChatGPT level, it's like, okay, I'm looking for this keyword, and it goes and scans it all, puts a nice little concise report together for you. Boom. That just made your work a lot easier. And then you're not wasting. Rhett: Dude, their PRCI literature reviews just got replaced. 16:50 Rhett: We just need to take all the PRCI documents, dump them into one thing, and have ChatGPT go through them. Adrian: Well, at the risk of offending PRCI, it's much better than what they got now. 17:01 Adrian: That might be helpful. Rhett: Oh gosh. Oh, gosh. Chris: Well, obviously, we see the advantages of machine learning, right? I mean, if anybody's ever invested any amount of time looking at ILI data, 17:15 Chris: God bless and thank you to the analysts, right, and the algorithm developers, because sometimes it does feel like a needle in the haystack, right? When you're looking for small indications, and as a community, 17:27 Chris: we rely on innovative tools like machine learning that are able to find patterns in some of the small-scale stuff, right? Or in patterns that are anomalous, that just visually and manually we'll struggle with. 17:38 Chris: And I think you kind of mentioned that, right? It's like, where you can apply observations across a larger data set to find truth, kind of, is the idea, right? And that's what you're pulling out from using the machine. 17:49 Adrian: And the AI doesn't tire. So, analysts may have been doing, like, overtime to get this job out. So they're having to scan through it. If you have a machine just running through it, it's scanned everything, it looked at every data point for you, and it's reduced it down to a small subset that the analyst can spend the- Rhett: I like the assistance aspect of it. The assistance aspect of it speaks a lot to me, and where its value is. 18:10 Rhett: But I tell you what, we're actually going to take a break.
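Two ideas from this stretch of the conversation lend themselves to a concrete sketch: gradient boosting, the decision-tree ensemble Adrian says suits sparse, tabular dig feedback, and tuning the operating point so that false positives are tolerated but misses are rare. Below is a minimal sketch using scikit-learn; the features, data, and threshold rule are hypothetical stand-ins, not TDW's model.

```python
# Minimal sketch: gradient-boosted screening of girth welds for crack
# susceptibility, tuned so misses are rare and false positives tolerated.
# Feature names and data are hypothetical, not any vendor's actual model.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
# Hypothetical per-weld features an ILI vendor might tabulate.
X = np.column_stack([
    rng.normal(0.0, 1.0, n),   # e.g. weld-region signal amplitude
    rng.normal(0.0, 1.0, n),   # e.g. ovality / misalignment proxy
    rng.uniform(0.0, 1.0, n),  # e.g. vintage / seam-type encoding
])
# Toy ground truth: a noisy function of the features stands in for dig feedback.
y = ((X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.5, n)) > 1.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)
model = GradientBoostingClassifier(n_estimators=200, learning_rate=0.05,
                                   max_depth=3).fit(X_tr, y_tr)

# Instead of the default 0.5 cutoff, lower the threshold until almost no
# susceptible weld is missed; everything below it is "confidently clean".
proba = model.predict_proba(X_te)[:, 1]
threshold = np.quantile(proba[y_te == 1], 0.01)  # catch ~99% of positives
flagged = proba >= threshold
print(f"threshold={threshold:.3f}, flagged {flagged.mean():.1%} of welds, "
      f"missed {(~flagged & (y_te == 1)).sum()} of {(y_te == 1).sum()} positives")
```

The quantile-based threshold stands in for the hyperparameter tuning he describes: you accept more flagged welds in exchange for a measurable, near-zero miss rate on the labeled digs.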
When we come back, audience, we're going to dive into what good data looks like as it goes into ILI, and what bad data looks like. And we're going to dive into that with Adrian right when we come back. 18:26 Rhett: All right, welcome back to our IPC series with our guest Adrian Belanger. So, Adrian, before we get back into AI, I have to do the same thing to you that I did to Christopher, which is the lightning round. I'm not going to ask you about comic book characters, though. Adrian: Oh, all right, all right. Rhett: You said you're a board game guy. Adrian: Yep. 18:45 Rhett: Or you said- so let's go. Monopoly or Battleship? Adrian: Battleship. Rhett: That is the right choice. If you said Monopoly, I was going to really... Adrian: Monopoly's caused too many domestic violence fits. 18:59 That's Uno for our family. Uno brings out the worst in people. You can turn Uno into a drinking game, and it's fun. Yeah, with the blank card that you write something on? Rhett: Ticket to Ride or Risk? 19:11 Adrian: Uh, Risk. Rhett: Okay. And you might not know these. Adrian: To be honest, I never played Ticket to Ride. Rhett: Okay, all right, all right. Adrian: I know it's popular, but I've never played it. Rhett: Okay, there are about seven variants of it. You probably haven't played this other game, then. 19:22 Rhett: King of Tokyo or Scrabble? Adrian: Oh, I'd have to say King of Tokyo. Rhett: Oh, all right, here we go. Got a little flavor for who Adrian is. Sorry, I just had to ask. 19:32 Rhett: I couldn't go with any of the real deep board games, because the audience wouldn't recognize them. By the way, audience, if you ever did recognize those games and you haven't tried out Ticket to Ride or King of Tokyo, you should give them a shot. 19:44 Rhett: Or Risk, if you must. But Risk is more fun on, like- we have the computerized game board. It's just- but you know how fun it is to play? We have the, uh, the game board now, 19:55 Rhett: the digital version of it. And you just play, like, on a touchscreen. Yeah. It's so much faster, because you don't have to argue over points, or how many armies go where, or roll the dice every time. It's not AI. Adrian: I love the little cannons and the little horses. Rhett: Yeah. Chris: There's something about the motion where you're trading in, yeah, pieces to get a cannon. Rhett: But you know what happens? Whoever ends up losing ultimately blames everybody else for ganging up on them, 20:20 Rhett: and quits the game. Adrian: That's usually why they lost, though. Rhett: Yeah. All right, so, getting back into our, our AI trend here. So let's talk a little bit, because you guys get feedback from vendors, and the premise of your paper was that not all data is good data. So, let's start with: what's the gold standard for good data in the context of this paper? Adrian: Well, the gold standard would be to have an accurate representation of what's actually in the pipe, every XYZ coordinate, because, think of it, it's a digital twin. We're trying to create a digital twin of your pipeline so that you can do an integrity assessment of it. 21:00 Adrian: So, if we're only getting a depth in a box, when we model it you're going to get a pipeline with square metal-loss boxes in there. And that's not a good integrity assessment. So, what you need in the ditch, you need some sort of laser scan if you have external, and if you have internal, some sort of automatic UT scanning.
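Adrian's contrast between "a depth in a box" and a full scan is easy to make concrete. With only length, width, and peak depth, the digital twin is a rectangular patch at uniform peak depth; with a full depth grid you can pull the axial river-bottom profile that effective-area assessments consume. A toy illustration with a synthetic grid, not real scan data:

```python
# Toy contrast between "depth in a box" and a full laser-scan depth grid.
# The grid is synthetic; depths are in fraction of wall thickness (wt).
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical scan: depth on a 50 (axial) x 20 (circumferential) grid.
axial, circ = np.meshgrid(np.linspace(-1, 1, 50),
                          np.linspace(-1, 1, 20), indexing="ij")
depth_grid = np.clip(0.6 * np.exp(-3 * (axial**2 + circ**2))
                     + rng.normal(0, 0.02, axial.shape), 0, None)

# Depth-in-a-box: one peak depth smeared over the whole feature length.
box_profile = np.full(depth_grid.shape[0], depth_grid.max())

# River-bottom profile: deepest reading at each axial station, which is
# what effective-area assessments actually integrate along the defect.
river_bottom = depth_grid.max(axis=1)

print(f"peak depth: {depth_grid.max():.2f} wt")
print(f"box metal-loss profile area:          {box_profile.sum():.1f}")
print(f"river-bottom metal-loss profile area: {river_bottom.sum():.1f}")
# The box overstates the lost cross-section, so the strength estimate
# built on it is needlessly conservative next to the river-bottom profile.
```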
Rhett: And so, with that data- so you mentioned, let's take a- let's take a complex 21:25 Rhett: corrosion pit. So, length and width and depth- those three variables on one hand, minimum feedback. Adrian: Minimum, yeah. Rhett: Versus your gold standard, which is a fully spatialized representation of the corrosion. What can you do when you have the fully spatialized laser scan? What does that then permit you to do? Why is it so good? Adrian: What that allows us to do then is- there's a whole group of machine learning models called convolutional networks. And what you can go in there and process… I'm biased toward MFL, which you can do for UT too, to remove just a little bit of the inaccuracies you may be having with UT, but UT is more direct. Adrian: With MFL, what you try to do is an inverse model of the magnetic flux leakage to the metal loss. I know Yanis Grossan had a really great paper about how, with the two magnetic fields, you can recreate that. It also allows us to do finite element modeling. You can get the magnetic flux leakage from the finite element modeling of that metal loss and then start working on inverse models. 22:25 Rhett: Gotcha. So, on the subject of data, and I- I- I'm curious, 22:31 Rhett: right? So obviously, if you've got the fully spatial data, now it really allows you- what I'm hearing, in layman's terms, is a much better understanding of the magnetic response and the ability to maybe predict or characterize, 22:44 Rhett: characterize other similar magnetic responses. Is that fair? Adrian: See, we call it the ill-posed problem, because the magnetic flux that's generated 22:56 Adrian: can be generated by multiple different geometries. Rhett: Yes. Adrian: So, what we'd like to do is just, what is the typical geometry of the corrosion that's out there? It's more like the most likely geometry you have, because the other geometries can be kind of bizarre-looking. 23:09 Adrian: Yeah. Rhett: And you mentioned that- you mentioned a statistical representation earlier, when you were talking about, like, this feature has a 90% chance of being this. So, with that, is it fair to say that you see it giving the analyst statistics, like: this feature, there's a 90% probability it's this type of feature or this classification? Adrian: Yeah, yeah. You can do that, and actually that's where, you know, Baker does that, 23:30 Adrian: because they have a whole library of that laser-mapped data, and when you see it multiple, multiple times, you see how likely it is that the MFL signal's being generated by these different geometries. Rhett: Interesting, yeah, that's cool. 23:45 Rhett: So, on the flip side, what does bad data look like? If not all data is good data, give me the bad data. Adrian: The worst cases are where you're getting, on your NDE report, the depth off of something that was not called by the ILI. 23:58 Adrian: It's in a different area. A lot of times it's just a mismatch between the field and the ILI data. That's bad data. The other data is not necessarily bad, but it's just inappropriate. Like, if you're going in and you're only measuring the deep stuff, 24:12 Adrian: but there's some light corrosion on there, it's just missing data. I mean, you have the ability to get that data, but it's not there, which also makes it more difficult to match it up with the ILI if that data is being excluded from the report, so you don't have a reference to match it up to. 24:29 Adrian: So, it's more missing data.
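The ill-posed inverse problem Adrian describes, where very different defect geometries produce nearly the same signal, can be illustrated with a toy forward model. A broad smoothing kernel stands in for the physics of the measurement here; it does not attempt to model real magnetics:

```python
# Toy illustration of an ill-posed inverse problem: a smoothing "forward
# model" stands in for MFL physics (it does not model real magnetics).
import numpy as np

x = np.linspace(-5, 5, 200)
kernel = np.exp(-x**2)            # broad sensor/field response
kernel /= kernel.sum()

# Two quite different defect geometries: one smooth pit, one jagged cluster.
smooth_pit = np.exp(-x**2 / 0.8)
jagged = np.where((np.abs(x) < 1.2) & (np.round(x * 4) % 2 == 0), 1.2, 0.0)

signal_a = np.convolve(smooth_pit, kernel, mode="same")
signal_b = np.convolve(jagged, kernel, mode="same")

# The geometries differ a lot; the "measured" signals differ far less.
print(f"geometry difference (RMS): {np.sqrt(np.mean((smooth_pit - jagged)**2)):.3f}")
print(f"signal difference (RMS):   {np.sqrt(np.mean((signal_a - signal_b)**2)):.3f}")
```

Because many geometries map to nearly the same measurement, the inversion has to lean on a library of real, laser-mapped defects to pick the most likely reconstruction, which is exactly why the gold-standard field data matters.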
Rhett: So how does that function in AI? I assume when you're doing these types of models, you guys must have to heavily screen the data before you feed it to the AI? Because you've got to make sure that the AI is understanding it. 24:40 Adrian: Well, that's one of the problems we get: we get these things on a paper report. They get hand-entered, so you can see there are errors in there. So, on the machine learning side, when we're actually training it, 24:51 Adrian: we're looking for anomalous ones. We have to go back by hand, look at the report, look for the picture, and make sure that, you know, somebody may have "fat-fingered" the depth. So that data has- 25:01 Adrian: so, it's the curation process. If you look at any of the machine learning breakdowns of the tasks they're doing, 25:14 Adrian: a very small part is actually training and building the model. Most of it's data curation. It's knowing your data and making sure your data is clean. Yeah. Rhett: You know, 25:25 Rhett: another question I have for you. This is intriguing to me, because you mentioned earlier in the first half of the podcast, you said something about the fact that, and it's on my heart: 25:36 Rhett: we only get data when we're wrong, right? And I felt that when I was at Rosen. I used to tell operators, it's almost like we create scarred… we create scarred analysts, because the only time you ever call your analyst, when he's wrong, is to berate him. 25:50 Rhett: Yeah, they don't want to talk to people. They always think, every time you call. Chris: Or to ask them how the report's going. Rhett: Yeah, right. If you get nothing else from this podcast, please call your analyst more than when they're just wrong, or to ask them about the report. Chris: Think about mental health. Rhett: Yes. Yeah, exactly. Yes, absolutely. It's a Pavlovian response that we're creating there, but it's not a salivating dog, it's a scared analyst. But, on the data side, I'm curious: if all we ever get is feedback when it's wrong, 26:19 Rhett: does that bias the models? Right? Because to look at a data set, don't you need a data set that covers both ends of the spectrum, the good and the bad? Adrian: You do. Yeah, you need data that just covers everything. 26:30 Adrian: What's that doing? Maybe it doesn't really bias the model so much, but it becomes ineffective, because you're not really knowing how off you are on the good stuff, where you could actually improve your accuracy. 26:42 Adrian: Another factor, too, is, like, when we get most of the dig feedback, it's usually in the 20 to 40% depth range. What you really want to do is be accurate on the 60s and 80s. And you don't get a lot of dig feedback on that, because usually they get repaired right away. 26:58 Rhett: Interesting. That's true. So, they go in the ditch. They don't see a need to scan it, because they know they're going to put a thing on, so don't spend the money on scanning it, just put the sleeve on it or cut it out and move on. Adrian: Yeah, yeah. Rhett: That's interesting. Chris: When we think about machine learning, I often think about, like, whenever we have a growth opportunity or an evolution of technology, there's value created. In your opinion, do you think, um, that value, do you feel, is more like received by the analyst right now, like they're more excited about it, because they're able to say, "Oh, look at all these cool things I can do now"?
27:33 Chris: And maybe that doesn't translate to the customer as directly? Or do you feel like maybe the feedback that you've got from the market is, the market's really appreciating all of this AI, 27:43 you know? Which one of those two do you feel like it is? Adrian: I think it's always, it's always marketing-driven. So analysts do want it, but analysts are skeptical, 27:56 Adrian: because they're going to- it's the, I'll wait and see, you know. Is it really going to be that accurate? Rhett: Is this taking my job? Adrian: Well, no, it's more of, "Can I trust it?" Rhett: Yeah, that's true. Adrian: When it gives me this depth, and I'm looking at it, do I trust me as an analyst, or do I trust this AI? Rhett: You have no idea how close that is to my heart, because that is a recency bias, right? So, whenever you- when we draw on information- 28:20 Rhett: like, fun fact: if you want to know what people think about something, you ask people to think of a number between zero and 20, and then ask them to give you an answer on something totally irrelevant, something else, like, what do you think the population of something is. A percentage of those people will always give you an answer that's closest to that 28:38 Rhett: last number they thought of, because our mind draws on recency. The same thing, I witness, happens a lot of the time in engineering assessments or analysis, right? So, if you bias the analyst with an initial indication one way- 28:50 Rhett: we saw it on bending strain whenever I did it, where I said, if you shade something as a field bend, the analyst will be inclined to think it's a field bend, when in reality it might not be. So, I think there is actually 29:00 Rhett: something to be said about how AI has to give the right indication to the analyst, or the analyst- Adrian: And I'm susceptible to that too, though, because, if you know, the big breakthrough, like ChatGPT's attention- 29:11 Adrian: that was the big paper that came out, "Attention Is All You Need", that Google did. So, what it is, is the window of attention that you have. If you talk with ChatGPT, after a while it starts to forget the early questions you have, because there's a window that is relevant to it. 29:26 Adrian: And the same thing happened, like, in the early days when we were doing, like, neural networks. We were training on these features and then retrained it on another set of features. It lost its accuracy on the old features, because it's having to adjust its weights for both of them. So, when you're modeling these things, you need to make sure that you're always incorporating- you're weighting things properly, incorporating all of 29:49 Adrian: the data that you have, so it doesn't forget things. Chris: Because obviously with our, with our previous employers- you know, a lot of the value that I personally- these are personal experiences, guys, right- I saw a ton of value on the analysis side, 30:03 Chris: right, in creating reliability, creating consistency, helping analysts go through complex processes more efficiently. So, like, through classification and grouping. 30:15 Chris: You know, so instead of an analyst- think of, like, in the early days, maybe even to some degree now, you start with the launcher and end at the receiver, right? So linearly you're like, oh, I've seen this pattern. Let me apply what I've seen currently.
30:27 Chris: Whereas some of these machine learning classification technologies can say, "Hey, they grouped all these for you already." Like, all these features are near girth welds, right? And now you're just, you're punching out only the features that are within some tolerance of girth welds. So your attention is on: right now I'm analyzing features near girth welds. Not: I had a girth weld, now I'm in the middle of the pipe, now I saw a dent, now I saw something weird. Oh, and I'm back at another girth weld. Adrian: Yeah, yeah. Chris: You were able to focus your analysis, not across threats, but across different classification types, right? Not classification as in, is this metal loss or not, but this type of interaction or this type of flaw. 31:03 Chris: And so, what it did is, we saw all these opportunities for it to create reliability, consistency, and efficiency in the analysis process. And as a customer, when you think about maybe the marketing or the value add, 31:15 Chris: it was, "Well, I thought you all were always good", you know? And they go, well, this is just a step forward to ensure it, yeah? Adrian: Yeah. 31:24 Chris: That's why I brought up the question around, you know, do you feel like the analysts today are more excited, or maybe the clients? Because I feel like maybe it just depends, you know? 31:33 Adrian: The analysts are excited it can take work off their workload. Chris: Yeah. Adrian: Yeah. And they have developed a trust in it, that they can just go forward, 31:42 Adrian: they don't have to double-check it. Chris: And then obviously, as we get more data sets and you're interacting different signatures and you're trying to bring all those together into one plain view, 31:55 Chris: then obviously machine learning starts helping with those types of things. And so, I feel like that's maybe the next step. Adrian: But I think what we'll see then is an evolution of the integrity assessments, because integrity's still doing- we have these grouping rules, right? These boxes and stuff, like interaction? Chris: Interaction. Adrian: Interaction rules. When you start looking at the actual corrosion itself, you can actually just start modeling the actual integrity straight from the actual corrosion profiles. Chris: Yeah. Adrian: Kind of like the way P Squared is going. Yeah, you know, you're doing river bottom, but you're taking out everything. Rhett: So, Adrian, as we close out, I'll give you a chance to think about this. 32:32 Chris: Oh, I got a finisher for him, too, so you go ahead. Rhett: Three- okay, I was going to give three requests you would make of operators for good data. What would be your top-three wish list? 32:42 Chris: I was kind of similar, but I'm still going to ask mine anyway. Rhett: That's fine. If you were to tell me, hey, if you do nothing else, this is what I want most. Chris: Yeah, you gave him three, so I'm going to piggyback on that. Let's say you collected all the Dragon Balls, 32:55 Chris: because that's what I was going to do. What would you wish for… in this subject… on this topic, right? Obviously, you have to- Rhett: Well, it's good. You get three wishes. There we go. We'll give you three. 33:05 Chris: Yeah, but now you're getting three wishes. Rhett: If you can't think of three, you can do one and take it. Adrian: No, three is fine, because I'll break it up into the different integrity threats. I think I'm always going to fall back on laser scanning. 33:16 Adrian: So, getting the laser scan for corrosion.
And then the second one would have to be the automatic UT. Not just going in with the "let me find out where the deepest depth is" with a handheld, or gridding it out, but getting, getting that full matrix of what the ID looks like. And then the same thing for dents, too. 33:39 Adrian: You know, that way we get away from the bridging bar problem, because we're always having to calculate… Rhett: It's as long as the longest bridge. Adrian: Yeah, just, just mapping out what is the actual XYZ of that outer surface of the pipe. Rhett: 'Kay. Well, before we go, I have one last question, but it's not to you, and it's not to Chris. It's to Trip S. All right, are you ready? Sarah: Okay. Rhett: Um, and I'll read the- Chris, you can repeat for our audience, because they probably can't hear. Trip S, here we go. 34:07 Rhett: Lord of the Rings or Harry Potter? Sarah: Harry Potter. 34:10 Chris: Harry Potter, she said? Sarah: Harry Potter, 100%. Rhett: That is- I almost want to stop there, but we won't. Chris: Wait, are you going to ask who would win the fight in either one of those? Rhett: No, there's no, there's no question. Sarah: My fiancé is the Lord of the Rings guy. 34:23 Rhett: Well, he's a good man. Star Wars or Star Trek? Chris: Star Wars or Star Trek? Sarah: Okay, my dad's a Star Trek-y guy. Like, I have to say Star Trek. Rhett: Okay, that's defensible. Because she probably hasn't watched either one. 34:35 Rhett: That's what I took. Audience confirmation: she hasn't watched either one. Marvel or DC? Chris: Marvel or DC? She was like, what are you talking about? Sarah: Marvel for sure. DC- I have strong opinions on this, actually. Chris: You have strong opinions on DC? Rhett: Well, you know what? You, audience, might have to come back for another episode, maybe a guest episode where we interview Miss Producer and talk about the difference between Marvel and DC. But for the time being, 34:59 Rhett: Adrian, I want to say thanks for joining us on this show. It's been a lot of fun. I want to say thanks to Dragon's Lair for letting us have this- use this beautiful room. And we hope you enjoyed it as we nerded out a little bit with Adrian. 35:12 Rhett: Catch us again in two weeks on Pipeline Things. All right, this episode of Pipeline Things was sponsored by D2 Integrity, produced by our executive producer Sarah Roberts. 35:23 Rhett: We'd like to give special thanks to Dragon's Lair in Tomball for letting us use their rooms for the recording, and to our guest Adrian Belanger for being on this episode. Thanks for listening.
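Adrian's earlier remark that most of the machine-learning effort is data curation, screening hand-entered dig reports for mismatches and fat-fingered depths before anything is trained, is straightforward to picture in code. A minimal sketch; the column names, tolerances, and records are hypothetical:

```python
# Minimal sketch of dig-feedback curation before training: flag records
# whose field measurement cannot plausibly match the ILI call.
# Column names, tolerances, and records are hypothetical.
import pandas as pd

digs = pd.DataFrame({
    "ili_depth_pct":      [23, 41, 35, 18, 52],
    "field_depth_pct":    [25, 44, 87, 17, 4],   # 87 and 4 look suspicious
    "odometer_offset_ft": [0.3, 0.8, 6.5, 0.4, 0.2],  # ILI call vs dig location
})

TOOL_TOL_PCT = 10    # e.g. a spec tolerance of +/-10% wall thickness
MATCH_DIST_FT = 2.0  # beyond this, it may not be the same pit at all

digs["depth_mismatch"] = (digs.field_depth_pct
                          - digs.ili_depth_pct).abs() > 2 * TOOL_TOL_PCT
digs["poor_alignment"] = digs.odometer_offset_ft > MATCH_DIST_FT
digs["needs_review"] = digs.depth_mismatch | digs.poor_alignment

# Anomalous rows go back to a human, who rechecks the report and photos
# before the record is allowed into the training set.
print(digs[digs.needs_review])
print(f"{digs.needs_review.mean():.0%} of records held out for manual review")
```

Only the rows that pass, or that a human re-verifies against the report and photos, would make it into a training set; everything else is exactly the "not good data" of the paper's title.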
