Episode Transcript
Rhett
On this episode of Pipeline Things, we talk rise of the machines, the T-zeros, the T-1000s, everything you need to know about machine learning and how it's taking over integrity. No, but in all seriousness, we talk with Cliff Maier about his top 10 machine learning predictors for SCC.
Chris
If you didn't read the paper, we've got the Cliff notes.
Rhett
Oh my god, that was clever.
*intro*
Rhett
Welcome to the PPIM edition of Pipeline Things. We continue on our journey of interesting papers that we are examining, and today our guest is Cliff Maier, and he's gonna be talking about using machine learning to manage SCC data: strategy and best practices. So, while I do expect that we're gonna have a phenomenal conversation around a hot topic, that paper title is lame.
Chris
But that shirt is not.
Rhett
That shirt is fire.
Chris
That shirt is fire, bro.
Rhett
I do agree with that.
Chris
Kudos to you, for real.
Rhett
That's true, well done.
Cliff
Thank you, props to my wife and my daughter.
Chris
Yeah, my wife still isn't able to dress me. She's trying.
Rhett
I mean, you could have added something like SCC, or "The Machines Are Taking Over SCC," "Rise of the Machines," playing off of Terminator.
Chris
Is this T1 or T2? Or T0?
Maybe T0?
Cliff
Stay tuned. Yeah.
Rhett
Is it gonna melt down and take us over? Point is, I would like to work with you on future titles for your papers, to make them really come out and punch people in the face, grab attention.
Cliff
I would appreciate your input, yeah.
Rhett
Would you really?
Cliff
Absolutely.
Rhett
Okay, cool. And then you can help me with color schemes because I can't pull off pink, but clearly you can, so well done.
Cliff
Thank you.
Rhett
To our audience out there, you're gonna have to look at the YouTube to get any of what we're talking about.
Chris
How are you not watching the YouTubes yet? Well, I guess it's easier to listen on Spotify.
Rhett
I mean, all the cool things are.
Chris
It makes sense. It's easier to listen to.
Rhett
So Cliff. Will you set the stage for us a little bit?
Chris
Give us the Cliff notes.
Rhett
We know a lot about SCC. We've covered that a couple of times. Audience, I'm gonna assume that you're familiar with SCC. If not, we have past episodes on it. But tell us a little about who you are and how did you get pulled into SCC?
Cliff
Absolutely. I started my pipeline career about 25 years ago. I started working for CC Technologies and initially got into things like finite element analysis, and then laboratory work, looking at corrosion experiments and then cracking experiments. Eventually I went into pipeline integrity, doing fitness-for-service type assessments, and really developed an interest in the cracking threat. So, working with the likes of John Beavers, I got to learn quite a bit about stress corrosion cracking.
Chris
Can I give you a fun fact?
Cliff
Absolutely.
Chris
I bet you less than two thirds of our listeners know who John Beavers is.
Rhett
I've heard the name.
Chris
Oh, there you go.
Rhett
I know. I know. I don't know him personally, but I've definitely heard the name
Cliff
He's well known within the industry and definitely very involved with NACE and other organizations. So very well respected in the—
Chris
Another one of the sages.
Rhett
Think we could get him on the podcast? I am just kidding, keep going.
Cliff
So roughly five years ago I saw an opportunity at TC Energy on the integrity management team. They had an opportunity for me to be part of that team for stress corrosion cracking threat management. So I jumped on that, and I've been there ever since, still in the same role.
Rhett
Fantastic. So, sidebar conversation: you've served on both the consultant side and the operator side.
Cliff
Yes, I did.
Rhett
Do you like the switch? Has it been good for you?
Cliff
It's a good switch, yeah. Being on the operator side gives me different insights and different perspectives now, so it's nice to kind of see the whole picture.
Rhett
Yeah, I bet it does. So tell us a little bit about this paper, right? Because the subject of SCC has been out there for a while. We have various susceptibility models. So when I see that we're using machine learning to manage SCC, what are you guys trying to do?
Chris
What's the need, why is it cool?
Rhett
I mean, yeah, can't we just find where we're above 60% SMYS, find out where we have the soil types, and move on?
Cliff
Well, SCC is a complex threat. There are a lot of complex factors that go into where the cracking develops and grows. One of the assessment methods that we use at TC is stress corrosion cracking direct assessment; I know a lot of operators use that method. And in that method, we utilize excavation data to integrate with inline inspection data, and so on. So we can understand—
Rhett
Oh, y'all integrate with inline inspection data, though, right? Y'all are not pure DA, where it was like, guess and check: guess here, we think it exists there, go dig, it doesn't. Like, y'all are actually integrating it with ILI data. I assume EMAT?
Cliff
Not so much EMAT for DA. If we have an option to run an EMAT tool, we may choose that option instead of DA.
Rhett
Because you're integrating others, like MFL?
Cliff
Yeah, that's correct. So with DA, we typically are looking for conditions that are favorable for cracking to develop. We're not actually chasing a specific crack anomaly that we know about; we're just looking for where all the factors coincide. And that's where the machine learning comes in, because we wanted to look at machine learning as a way to analyze our big database of dig information and see what patterns we can identify in the dig data for where we find the cracking.
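What Cliff is describing amounts to a supervised classification problem: one row per historical excavation, a set of site attributes, and a label for whether cracking was found. A minimal sketch in Python of how such a dataset might be framed; the file name and column names here are hypothetical, not TC Energy's actual schema:

```python
# Minimal sketch: one row per historical excavation, with a binary
# label for whether SCC was actually found at the dig.
# File name and column names are hypothetical.
import pandas as pd

digs = pd.read_csv("dig_history.csv")

# Candidate susceptibility attributes recorded for each dig site
# (a few of the roughly two dozen variables mentioned in the episode)
features = digs[[
    "soil_type", "soil_texture", "operating_stress_pct_smys",
    "coating_type", "distance_to_compressor_km", "site_topography",
    "annual_precipitation_mm", "clay_fraction", "temperature_c",
]]

# Label: 1 = cracking found at the excavation, 0 = not found
labels = digs["scc_found"].astype(int)
```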
Chris
And that data set does include your inline inspection EMAT history?
Cliff
Well, it does, yeah. Yes.
Rhett
Can I ask, is it proprietary? When you say big data set, how big is the data set that you're feeding in here on excavation history?
Cliff
It's hard to put a number to that.
Rhett
More or less than a thousand.
Cliff
It would be more than a thousand.
Rhett
More or less than 10,000.
Cliff
It would probably be a bit less than 10,000, but that's a good range. The data set we're specifically talking about here today is roughly 2,000 digs, dating back to 2012. And these cover a varied array of reasons to dig. So we would have digs that are based on metal loss features, digs based on dents, and other reasons, not specifically just for cracking.
Rhett
Got you. So you've done these excavations over the last 20 years for a variety of reasons, you're taking all the variables, and then feeding those into what you're calling a machine learning model to do what?
Cliff
That's correct. So in terms of our SCCDA program, as I said, we conduct excavations where we think cracking is most likely to occur. Well, we use a scoring model for that, and in the scoring model we use different data attributes, such as topography, drainage characteristics, whether there's a dent present from a previous ILI, and so on. When we look at the scoring model and select the dig site, oftentimes we find that there might be two or three locations that really seem like good candidates. So unless we're digging multiple sites, it would be helpful to distinguish a little bit better which site might be more appropriate, where we're more likely to find cracks.
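For context, a judgment-based scoring model of the kind Cliff describes can be as simple as a hand-assigned weight per susceptibility factor, summed per candidate site. The factor names and weights below are invented purely for illustration:

```python
# Sketch of a judgment-based dig-site scoring model. The factor
# names and weights are invented for illustration only.
SITE_WEIGHTS = {
    "low_lying_topography": 3.0,
    "poor_drainage": 2.0,
    "dent_present_from_ili": 1.5,
}

def site_score(site: dict) -> float:
    """Sum the weights of whichever susceptibility flags are present."""
    return sum(w for factor, w in SITE_WEIGHTS.items() if site.get(factor))

# Rank candidate dig sites by score, highest first
candidates = [
    {"low_lying_topography": True, "poor_drainage": True},
    {"dent_present_from_ili": True},
]
ranked = sorted(candidates, key=site_score, reverse=True)
```

The machine learning work described next effectively replaces those hand-picked weights with learned ones.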
Chris
Enter machine learning
Cliff
Enter machine learning. So machine learning we think can help us Be a little smarter about finding those locations.
Rhett
Right. So if what I'm hearing, and correct me if I'm wrong: right now y'all have not an arbitrary weighting system, but a weighting system of these factors in some form that was thought about. But we know these weighting systems are judgment based. Are you trying to find something smarter than your current weighting system to improve your dig locations? I'm trying to define machine learning in a simple concept.
Cliff
That's absolutely right. So the machine learning will help us understand the relative importance of the different parameters.
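Continuing the earlier sketch, this is roughly how a trained model surfaces that relative importance. A random forest is used here as one common choice; the episode doesn't specify which model type TC ultimately selected:

```python
# Sketch: fit a tree-based classifier, then read off the relative
# importance of each input attribute.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# One-hot encode categorical attributes so the model sees numbers
X = pd.get_dummies(features)

clf = RandomForestClassifier(n_estimators=300, random_state=0)
clf.fit(X, labels)

# Rank attributes by the model's internal importance measure
ranked = sorted(zip(X.columns, clf.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, score in ranked[:10]:
    print(f"{name}: {score:.3f}")
```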
Rhett
So now, these digs going back 20 years that you've fed into this, one of the things that immediately stands out to me, and I'm just curious: I assumed not every dig had the same information pertinent to it.
Cliff
That's true.
Rhett
So that seems like it'd be a challenge. Like, I could see, oh great, we found SCC at this location, fantastic, but they didn't get information on X variable, right? Whenever you're working with a data set like that, how did y'all compensate for the fact that you might have gaps in various data entries to the machine learning model?
Cliff
Well, I think we try to minimize the gaps, obviously, but inevitably there are some gaps. So the type of model that is chosen will hopefully account for some of those gaps and do a good job of managing them. That will shake out when we validate the different machine learning models.
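Two common ways such gaps are handled in practice, sketched under the same assumptions as the earlier example; the transcript doesn't say which approach TC took:

```python
# Sketch: coping with missing values in the dig records.
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.impute import SimpleImputer

# Option 1: a model that tolerates gaps natively.
# HistGradientBoostingClassifier learns a default branch for missing
# values at each split, so incomplete dig records still contribute.
clf = HistGradientBoostingClassifier(random_state=0)
clf.fit(X, labels)  # X may contain NaN entries

# Option 2: fill the gaps first (here with the median of the observed
# values), then use any model family.
X_filled = SimpleImputer(strategy="median").fit_transform(X)
```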
Chris
So go back just a little bit, right, back to the why. You guys are in a DA program, you have to do excavations, and you have data you're trying to render to understand which locations have a higher priority in establishing where to dig. So you have all these previous excavations, you have data, and you guys are exploring machine learning to find which parameters are better indicators, so you can say: if I have this, this is likely where I will find a crack, if a crack exists, as part of my DA program.
Well, that sounds fun. Absolutely. Wow. What did you learn?
Cliff
We learned a lot. I guess the biggest thing we learned is that we validated this machine learning approach as a good proof of concept, that machine learning can be used for this purpose. Nice. We also learned that the machine learning is telling us a lot of things we already knew about stress corrosion cracking. So a lot of the same susceptibility factors that you would see in ASME B31.8S, or the CEPA recommended practices document, or the NACE document. They're all consistent, so the machine learning model just reinforces some of those things we already knew.
Rhett
Well, you say that, but I wanna help the audience understand it, Cliff, and this came straight from the presentation. Audience, if you weren't here, I'd encourage you to take a look at the paper. It kinda looks to me, real quick, like that's 24, 28 variables that y'all threw at the wall like mud. The initial variables you threw at the machine learning model look like mud on the wall. I mean, it's got things like years without CP, surface preparation. That's interesting.
Chris
Topography.
Rhett
Hard spot. I mean, things that make sense. But pressure cycling? I've never heard of anybody try to, you know, liken pressure cycling to stress corrosion cracking.
Chris
Depth of cover.
Rhett
Depth of cover, yeah. I mean, river crossings, groundwater. Y'all threw the book at the machine to see what it would find, and then it came back and told you, oh, here's what matters. Did it not surprise you that you didn't find anything really, really new in the variables it threw back at you?
Cliff
Initially, there's nothing obvious that jumps out as being surprising. As we continue to evaluate the model, we may learn more things that maybe we didn't know before. For example, there could be interrelationships between some of these parameters that may not be evident in some of the guidance documents that are out there.
Rhett
Did you come back and, I mean, like, flex? See, us old guys, we developed this, and with all this fancy machine learning you got the same answers.
Chris
Well, we haven't gotten to like the top five—
Rhett
We get John Beavers to come back on, you give him the data, and John Beavers is like, yeah, see, I told you, you didn't need to do all that. What we had was good enough.
Cliff
Exactly, yes.
Chris
I kind of want to start asking him about what they found, like which parameters are the most important.
Rhett
Oh, I think that's a good thing to do.
Chris
Wait, can I ask before we take a break?
Rhett
You can ask the question and then I'll force the break. Since we're going to do it that way, go ahead. Ask the question.
Chris
What parameters did you find surfaced that were better correlated to where you'd find cracks?
Rhett
And this is where I flex and I tell you not to answer that question until we get back from the break. Audience, we will be right back and we'll hear Cliff's thoughts on what they found in terms of most important parameters.
Rhett
All right, welcome back as we continue our discussion with Cliff Maier on the subject of SCC and machine learning. Terminator 1.0 might be what this episode is called, still TBD. But just to recap for the audience what we did: Cliff Maier, who is at TC, has taken a large pool of information from previous digs for SCC, where you had confirmed SCC, taking every variable that you could have collected from those digs in the last 20 years, some that you suspected, some that you maybe didn't even know about, and threw them at the wall to see, hey, which of these variables, when we apply a machine learning type model, begins to identify factors that stand out. And are those factors better than the, let's say, somewhat arbitrary weighting factors that we would have selected, or that we've been using perhaps for 20 years? So, in comparison, audience, to give you some idea: if you read B31.8S, it has a section on stress corrosion cracking, right? And it says things like operating stress above 60% SMYS, soil, topography, but it provides no weighting factors. Right? So you guys have taken it a step further and have some weighting factors. Now the machines have risen up, and they have provided us with even better weighting factors, so we think. Is that fair? Have I set the stage?
Cliff
I think you have.
Rhett
So Chris, you asked a question before the break, which I completely cut off with the authority that was given to me as host of the show—
Chris
As the storied, uh, knight of Pipeline Things.
Rhett
So we were going to a break. Would you like to ask that question again?
Chris
Sure. After you guys went through the machine learning, what predictors did you find, did you guys think, were most relevant? And were you surprised?
Cliff
I'll answer your second question first. No, we weren't surprised.
Chris
Um, is that surprising?
Rhett
We found the old guys got it right.
Cliff
What we found was that the predictors the model said were most influential actually aligned quite well with the guidance and literature that's been in the industry for years.
Chris
That's comforting, right? That kind of makes us think whatever we've been doing is still on par. It's okay; the sky isn't falling.
Rhett
I think it's good that it's objectively validating what our experiences have been, which, I mean, that's kind of what I see. So I'm with, I'm with Cliff, Chris.
Chris
You know that the OPPs had it right.
Rhett
Exactly.
Chris
Right, the old professional pipeliners?
Rhett
I wasn't going to say that. That was on you, man, if you wanted. We don't have an OPP congregation.
Chris
We got to change it up a little bit, right? We don't want to follow perfectly.
Rhett
All right, so just talking a little bit about them: the top 10 machine learning predictors for SCC. Are these in rank order, by any chance?
Cliff
Yeah, they are. They are. And these are the top 10 that we found based on the data set, as applies to our system. So—
Rhett
Oh, I like how you used that.
Chris
That was so clutch. You are a pro at this.
Rhett
Wow, he totally clarified it. It was like fine print inserted into the—
Chris
It's almost as if you'd even worked for an ILI vendor before. This is what works for our system. That was good. Keep going. Keep going. That was great.
Rhett
Data, asterisk: only applicable for TC systems, in certain circumstances, when applied with the correct machine learning model and the data fed into it from 2012 on.
Cliff
Thank you. So we're talking about the top 10 predictors. And to the last point here, many of the predictors align very well with what's in the industry documentation. Yeah. So for example, soil type, soil texture, operating stress level.
Rhett
Yeah, those are all ones definitely we know about.
Cliff
Coating type. Distance to nearest compressor station. Yep. Soil topography, I'm sorry, site topography. Precipitation, the clay composition of the soil, and temperature.
Rhett
So I wonder, and it's interesting to theorize here, but I wonder if clay is leading to the degradation of the coating because it's so stiff around the coating and has the ability to degrade it. I wonder what the rationale would be, or is there a better rationale? Why is clay a factor?
Cliff
I think clay is a factor because it has to do with the absorption of moisture and the ability of the clay to impose stresses on the coating and its ability to degrade the coating based on the moisture.
Rhett
Interesting. Just observations, and I do wanna ask you about it, if you don't mind. So it's cool, y'all have annual precipitation on here, which kind of makes sense, because we know that water is a factor in corrosion, so we would assume that in that susceptible environment, water is a factor. But then you didn't find things like groundwater or river crossings to be a factor. Do you think there's anything to that, or is it just separation? Does it matter how the water actually gets to the pipe?
Cliff
Well, to be honest, the couple of things that you mentioned may not have been included in the model that we considered here. But I will say that we did incorporate data from public data stores, which could fill in some of the gaps in what TC has in our database. We don't have all the data from our digs that we might consider in the model, so we had to get information from data stores. Yeah. This is publicly available information about climate, so we're talking about temperature and precipitation. And then we're also talking about SSURGO data. So as much as we have soil characteristics from the dig information, there may be some things we can pull from the publicly available store that fill in some of those gaps as well.
Chris
And in fairness, these are just the top 10, not necessarily all of them.
Rhett
That's correct.
Chris
And I think just as a reminder, you didn't say—
Rhett
We're not sharing the other 10, you know, they're only applicable for their system, that's proprietary.
Chris
But obviously, the idea is, I mean, to some degree, these are almost weighted, right? So for example, if you were to say soil texture was the top rank, you're probably saying that compared to the tenth one, it's significantly higher weighted as an SCC predictor. Is that correct?
Cliff
I think that's correct, yes. So the ranking is really based on looking at multiple models. And the more significant a parameter is across all the models, the higher it ranks in the list.
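One plausible way to build a ranking that rewards parameters which matter to every model family is to normalize a model-agnostic importance measure per model and sum it. This is an illustrative sketch; the aggregation TC actually used isn't described in the episode:

```python
# Sketch: combine importance scores across model families.
# Permutation importance is model-agnostic, so the per-model scores
# can be normalized and summed on a common footing.
import numpy as np
from sklearn.ensemble import (HistGradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X_tr, X_te, y_tr, y_te = train_test_split(
    X_filled, labels, test_size=0.25, random_state=0)

totals = np.zeros(X_filled.shape[1])
for model in (RandomForestClassifier(n_estimators=300, random_state=0),
              HistGradientBoostingClassifier(random_state=0)):
    model.fit(X_tr, y_tr)
    imp = permutation_importance(model, X_te, y_te,
                                 n_repeats=10, random_state=0)
    # Normalize so each model contributes equally to the final ranking
    totals += imp.importances_mean / imp.importances_mean.sum()

ranking = sorted(zip(X.columns, totals),
                 key=lambda pair: pair[1], reverse=True)
```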
Chris
Oh, that makes us ask: you said models, not model. What's that about?
Cliff
So in the development of the model we ended up using, we tested various types of models. So there's more than one machine learning model that's out there or more than one method. And so we can see—
Rhett
There's a hundred: there's the T-zero, there's the liquid guy.
Cliff
We can see the—
Rhett
Cliff just blew right past us.
Cliff
Oh, sorry.
Chris
No, no, no, get to it. Get to it. Yeah, so there's multiple models.
Cliff
But we can evaluate the different model types and see which ones give us the best predictions.
Rhett
So how, let's talk about that. How did you judge "best predictions"? 'Cause now I'm curious about that. You've developed these weighting factors, you've listed them here, they're not necessarily completely different than what we knew. How did you determine which model is or isn't best?
Cliff
Right, so for machine learning models, there are different metrics that are used to evaluate the efficacy, or the predictability, of the model. How well does the model predict against the test data? So where we actually found cracking, how much of the time were the predictions correct? Yep. These metrics would include accuracy, sensitivity, and specificity. The bottom line is we're above 85% on all of these metrics, which says that the model is very good at predicting and can be reliable.
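For listeners who want the definitions, all three metrics fall straight out of the confusion matrix on the held-out test digs. A short sketch, reusing the hypothetical variables from the earlier examples:

```python
# Sketch: accuracy, sensitivity, and specificity from the confusion
# matrix of predictions on the held-out test digs.
from sklearn.metrics import confusion_matrix

y_pred = model.predict(X_te)
tn, fp, fn, tp = confusion_matrix(y_te, y_pred).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)  # all predictions correct
sensitivity = tp / (tp + fn)  # digs with SCC that the model flagged
specificity = tn / (tn + fp)  # digs without SCC correctly cleared
print(f"accuracy={accuracy:.2f}  sensitivity={sensitivity:.2f}  "
      f"specificity={specificity:.2f}")
```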
Rhett
So let me be clear, though. You had 2,000 data points. Did you have a separate data set that you tested against?
Cliff
So of the 2,000 data points, we split that into a training data set and a testing data set. Exactly.
Rhett
Out of curiosity, was that a one-time split, or did you test it numerous different ways? Like, let me train it on this portion of the data set and test against this; split it differently, train on this, test against that; split it again differently, train on this, test against that.
Cliff
We tested a couple of different ways. I wouldn't say it was exhaustive, but we did test a couple of different ways to see what the most appropriate breakdown was. And for us, it made sense that the EMAT digs, the records from those, comprised the test data set, and we would learn based on all the other dig data.
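Holding out the EMAT-driven digs as the test set is a group-based split rather than a random one: the model is trained on digs called for other reasons and judged on the EMAT family. A sketch, with a hypothetical dig_reason column:

```python
# Sketch: group-based split, with EMAT-driven digs held out for
# testing and everything else used for training.
# "dig_reason" is a hypothetical column name.
emat = (digs["dig_reason"] == "EMAT").to_numpy()

X_train, y_train = X_filled[~emat], labels[~emat].to_numpy()
X_test,  y_test  = X_filled[emat],  labels[emat].to_numpy()
```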
Rhett
Gotcha. So now that you've gotten the rankings, and you've shown that, at least in the simulations, it appears to be good, where are y'all going from here? When are you planning on rolling this out in real-world trials? Or how are you planning on doing that?
Cliff
Well, right now we're actively looking at applying the model to our DA, or direct assessment, program for SCC.
Rhett
For this year coming up?
Cliff
It's a little bit late for the 2025 digs. Moving forward, we're going to use it for the next cycle. And we think that that's going to help us select the digs.
Chris
We need to set Outlook reminders and ask him in 2026.
Rhett
Will there be an IPC follow-up paper? Should we expect to hear more from Cliff on this subject?
Cliff
I think there could be a part two. I think there could be a part two.
Rhett
And so when you do this, do you envision-- because right now, y'all have a selection process. Do you envision relying on that, but then saying, hey, here's two or three other digs that the model suggests might be places we should go look, and you're gonna incorporate those? Or are you just gonna wholesale jump ship and go to the new weighting system?
Cliff
I think it's possible if the model is directing us to a location that maybe wasn't our first choice, maybe we'll consider additional digs just to validate that, right?
Chris
I mean, it's gotta be fun, right? You go through this whole effort, and when you get to what I'm gonna call the performance of it, you're getting near 90%. That's gotta be pretty exciting, right? You're like, man, this was a good project. We did well.
Cliff
Well, it's very exciting. And we think the model is very applicable to something like direct assessment, where we can pick a dig site. When we consider something like EMAT digs, however, it'd be nice to use the machine learning in that regard too.
Chris
To incorporate that as a data set, to also increase the performance of your hit rate on EMAT digs.
Cliff
At this point, though, I think it's probably best to use it to gauge expectations for what we might find, as opposed to influencing the decision making on whether to excavate.
Chris
That is a great point. That is a great point. You have the EMAT signal; you can't necessarily be fully biased just because this machine learning is saying, hey, all the predictors aren't there. You're like, I'll consider it, but I'm still going to make sure I'm verifying or validating the EMAT results. At least at this level of maturity, right?
Cliff
True, yeah. We know the model is pretty good, but it's not perfect. So there are still cases where the model might say there's no SCC, but maybe we found SCC.
Rhett
Yeah, but so none of the DA models are perfect, right?
Chris
Yeah, but it feels like a win. So I would ask this: maybe for other operators or others trying to go down this path, any lessons learned? Anything that you were like, yeah, this was really important that we figured out in this process?
Cliff
Well, I think it illustrates for the industry that machine learning is definitely a legitimate approach that can be used to help manage SCC and other threats. Yeah. In this case we applied it to SCC, obviously, but it can be used to manage other threats. And there you go.
Rhett
So you're opening the door. I think this is a good topic to close on. You're saying other threats. Do you have ideas? I mean, are y'all thinking about maybe non-axial cracking, or where do you kind of see this going in the future, Cliff? Beyond testing it, are you going to try this on other threats, do you think?
Cliff
It's possible. I'm not the decision maker there. I think we'll definitely share the learnings here with other threat teams, and we'll see where it goes.
Rhett
If it works for axial SCC, I mean, it seems logical that you might extend it to circumferential SCC, for instance.
Cliff
We're looking at that as well.
Rhett
We're looking at that.
Chris
Keep digging, right? Keep digging, yes.
Cliff
Circumferential SCC. Maybe using it to prioritize segments for EMAT inspections. Yep.
Rhett
Oh, now that's-- see, I like that. Prioritizing segments for EMAT inspections.
Chris
Keep digging, keep digging. You're doing well.
Cliff
I'll give you one more: validating risk results.
Rhett
Yes. Oh, that implies that you did a risk assessment first, so—
Cliff
Sure.
Rhett
I'm just joking. But that's one of Chris's, like, trigger points, right? Performing risk assessments. So, well, hey, Cliff, I want to say thanks for joining us. I really look forward to seeing where y'all go. I hope there is a part two. If you need help with the title, let me know; I'm happy to do it. So that's it: Judgment Day. That should be it. Machine Learning Judgment Day: Results of the Assessment. See, I've given you the name of the paper. There it is. Of course, at an IPC.
Chris
IPC 2026.
Rhett
You know it's good, don't lie. And then just give me props. Like, when you give the presentation: "I'd like to thank..."
Chris
Don't give him props. Don't give him props. Do not do it.
Rhett
Absolutely. Well, hey, thank you for joining us. We appreciate it. To our audience, thank you. Hopefully we got to see you at PPIM; if we didn't, there have been some great papers, and hopefully we catch you around. Thanks for joining us on this episode of Pipeline Things.