Forbes
15 Aug 2023


Image: Mega-personas being derived by prompt engineering in generative AI (credit: Getty).

Go big or go home.

I’m sure you’ve heard that oft-used saying. The roots of this now-famous idiom trace back quite a way. All in all, the implication of the advised doctrine is that you ought to go all-in or not even try to begin with. Pack up your bags if you aren’t willing to do whatever is necessary to win the game, score the big goal, or otherwise reach your highest aspirations.

Look upwards and keep your gaze high.

In today’s column, I am continuing my ongoing series on prompt engineering and will be showcasing the latest in a go-big-or-go-home strategy that you ought to include in your generative AI know-how. The deal is pretty straightforward. In prior columns, I’ve discussed the use of personas as a prompting approach, namely telling generative AI to pretend that it is, say, a lawyer or a medical doctor, see the link here. I ratcheted up this handy technique by indicating that you can do multiple personas all at once, see the link here. For example, you might use a multi-personas prompt that instructs generative AI to pretend to be a group (or is it a gaggle?) of lawyers that debate a legal case from opposing sides, see the link here.

Well, now I bring to you the mega-personas approach to prompting.

Get yourself ready for a bigger-is-better mantra.

You see, in contrast, the usual multi-personas approach is typically undertaken with some relatively modest number of personas, perhaps a handful to maybe ten or so, tops. This seems sensible in that you might assume going higher would be either unnecessary or that the AI might balk at sizing up.

Toss off those mind-limiting restraints. The mega-personas approach takes the multi-personas to a proverbial go-big-or-go-home pinnacle. Fasten that seatbelt. Ready? Suppose we opted to enter a prompt that told generative AI to pretend that it consists of a hundred personas all at once, or maybe a thousand all at once. Perhaps many thousands or more. The sky might be the limit. Maybe.

I will share with you some intriguing new research on the mega-personas prompt engineering technique and explain what it seems able to accomplish. This is cutting-edge stuff. Meanwhile, I decidedly don’t want to seem gloomy, but you have to be mindful of some noteworthy and serious caveats and limitations regarding the mega-personas realm. Please do not just dart over to your keyboard and start ramping up tons upon tons of generative AI personas in the belief that this will work miracles. It could do nothing other than make a mess.

This is reminiscent of the old line about the person who has a hammer and becomes overly enamored with the capabilities of the handy-dandy tool. They inevitably perceive the rest of the world as consisting solely of nails. Thus, they take the hammer to, for example, screws that need to be tightened into their fittings, smashing them witlessly due to the false conception that a hammer fixes everything. Mega-personas have their place in your prompt engineering toolkit. Learn what mega-personas consist of, and make sure to use them well, neither overusing nor underusing them.

Your aim will be to judiciously employ mega-personas in generative AI akin to landing in the Goldilocks zone. Use mega-personas when things are just right, neither too cold nor too hot. Keep your wits about you. I will endeavor to enlighten you as to the tradeoffs involved in leveraging mega-personas, plus provide heady guidance for those wishing to further improve their prompt engineering prowess via this emerging technique. Welcome to an in-depth inquiry into the mega-personas arena.

Before I dive into the crux of this exciting approach, let’s make sure we are all on the same page when it comes to the keystones of prompt engineering and generative AI.

Prompt Engineering Is A Cornerstone For Generative AI

As a quick backgrounder, prompt engineering, also referred to as prompt design, is a rapidly evolving realm and is vital to effectively and efficiently using generative AI, which is built atop large language models (LLMs). Anyone using generative AI such as the widely and wildly popular ChatGPT by AI maker OpenAI, or akin AI such as GPT-4 (OpenAI), Bard (Google), Claude 2 (Anthropic), etc., ought to be paying close attention to the latest innovations for crafting viable and pragmatic prompts.

For those of you interested in prompt engineering or prompt design, I’ve been doing an ongoing series of insightful looks at the latest in this expanding and evolving realm, including this coverage:

Anyone stridently interested in prompt engineering and improving their results when using generative AI ought to be familiar with those notable techniques.

Moving on, here’s a bold statement that pretty much has become a veritable golden rule these days: the prompt you provide largely determines the quality of what you get back from generative AI.

If you provide a prompt that is poorly composed, the odds are that the generative AI will wander all over the map and you won’t get anything demonstrative related to your inquiry. Being demonstrably specific can be advantageous, but even that can confound or otherwise fail to get you the results you are seeking. A wide variety of cheat sheets and training courses for suitable ways to compose and utilize prompts have been rapidly entering the marketplace to try and help people leverage generative AI soundly. In addition, add-ons to generative AI have been devised to aid you when trying to come up with prudent prompts, see my coverage at the link here.

AI Ethics and AI Law also stridently enter into the prompt engineering domain. For example, whatever prompt you opt to compose can directly or inadvertently elicit or foster the potential of generative AI to produce essays and interactions that imbue untoward biases, errors, falsehoods, glitches, and even so-called AI hallucinations (I do not favor the catchphrase of AI hallucinations, though it has admittedly tremendous stickiness in the media; here’s my take on AI hallucinations at the link here).

There is also a marked chance that we will ultimately see lawmakers come to the fore on these matters, possibly devising and putting in place new laws or regulations to try and scope and curtail misuses of generative AI. Regarding prompt engineering, there are likely going to be heated debates over putting boundaries around the kinds of prompts you can use. This might include requiring AI makers to filter and prevent certain presumed inappropriate or unsuitable prompts, a cringe-worthy issue for some that borders on free speech considerations. For my ongoing coverage of these types of AI Ethics and AI Law issues, see the link here and the link here, just to name a few.

With the above as an overarching perspective, we are ready to jump into today’s discussion.

Foundations Of The Mega-Personas Phenomena

We will ease our way into the mega-personas arena.

I do so with a bit of upfront caution because I don’t want anyone to be led down the path of anthropomorphizing AI. In current times, AI is not sentient and should not be equated to the sentience of humans. I will do my best to repeat that alert when we get into certain aspects of the mega-personas generative AI details that might seem overly sentient-like.

Thanks for keeping a level head on these weighty matters.

First, the devil is in the details. Here’s what I mean.

Envision that I ask generative AI to pretend that it consists of five lawyers (a relatively small number of personas, e.g., a typical multi-personas request, not yet extending into the mega-personas, which I’ll get to momentarily). If I specify nothing other than this somewhat vacuous request, the generative AI isn’t particularly going to do a pretense of any notable variety. In essence, all five pretend lawyers might appear completely equal and undifferentiated. There aren’t any flavorful distinguishing features involved.

Thus, I might want to be a bit more detailed about what the multi-personas are to consist of. Let’s make one lawyer a high-stepping top-notch lawyer from a big-time big-bucks law firm based in New York City. One of the other lawyers will be from a small town in the Midwest and handles legal cases involving disputes between neighbors who live principally on farmland. Yet another of the lawyers will be an entertainment industry lawyer who hobnobs with celebrities in Hollywood, California. And so on.

Notice carefully that I am taking what otherwise might be a bland set of the same personas and trying to nudge the generative AI toward making each persona somewhat unique or at least different from the others. I don’t have to do this if there isn’t any need for having the personas vary. Perhaps I in fact want the personas to all be roughly the same. Ergo, I just want some pretend lawyers of roughly the same training, experience, focus, personalities, and the like, jammed into a courtroom, and have them pontificate on some legal matter. This alone might be sufficient, depending on what I am trying to answer or solve via the use of multi-personas and generative AI.

I dare say that much of the time, particularly when using mega-personas, you are likely to want the personas to be differentiable to some degree. An entirely homogenous set of hundreds or thousands of generative AI simulated or pretending personas is seemingly not going to provide much insight. In a sense, you could just as soon skip the multiples or mega build-up and go with the generative AI pretending to be one persona. One will seemingly do, by and large if all of the personas are otherwise essentially identical.

Now then, a desire to seek differentiable personas on a large or scaled-up basis does come with a drawback.

Envision that I want to specify the distinct “personal” characteristics (watch out for anthropomorphizing there) for each of a hundred personas that I want generative AI to pretend to be. Gosh, that might be a laboriously lengthy prompt. I might type that for persona number one, please pretend that this persona has this or that set of properties. I then type in for persona number two that it has these or those characteristics. On and on this goes. Imagine having to type out a hundred specific descriptions, one at a time.

Yikes!

Tiring, exasperating, altogether likely impractical.

This would be especially imprudent if I want to do the same for a thousand or many thousands of personas in my mega-personas requests or prompts to generative AI.

One alternative to the one-at-a-time specification is to be broadly descriptive. I might tell the generative AI that for a mega-persona setup of one hundred attorneys, I want thirty that are those big-time lawyers, I want fifty that are from the small town in the Midwest, and I want the remaining twenty to be those Hollywood-based attorneys. If I didn’t want to use counts, I could say the same via percentages, such as indicating 30% are to be the big-timers, 50% are the small-town lawyers, and 20% are to be the glitzy ones.

Furthermore, since this still tends to allow for homogeneous clumps of pretend personas, I might instruct the generative AI to overall do the mixing for me. For the ones that are going to be the big-time lawyers, I tell the AI to make some of them extroverted, some of them introverted, some of them highly logical and eschewing emotions, while some are to be highly emotional, etc. This leaves the mixing to the generative AI. I might also say that those are to be randomly distributed, or I might give percentages, such as indicating that 10% are to be the emotional ones and so on.
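
To make this concrete, a count-and-mix prompt of that sort might read along these lines (illustrative wording on my part, not a required format): “Pretend to be one hundred attorneys. Thirty are big-time lawyers at large New York City firms, fifty are small-town Midwest lawyers handling disputes between neighbors, and twenty are Hollywood entertainment lawyers. Within each group, mix the personalities, making roughly 10% highly emotional and distributing the rest randomly between extroverted and introverted.”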

You can use generative AI to do the differentiating for you. In theory, the generative AI will abide by your request. I state that this is in theory because you might be fooled into believing that the AI is doing this when it is using some other mechanisms to instantiate the pretense. For example, you might be thinking that the AI is going to spawn a hundred distinct virtual entities and anoint them with the properties you have identified.

Probably not going to happen that way. The odds are higher that the generative AI will simply retain a computational pattern and use that pattern as though there were those distinctive entities. I realize that might be a brain twister for those of you not into the techie details. I am suggesting that the generative AI will merely come up with answers and do so on an overarching basis rather than a delineated basis. It might compute that since (for example) the 30 big-time attorneys are composed of, say, 40% extroverted and 60% introverted (let’s say we allowed the percentages to be arbitrarily chosen by the AI), this suggests that 12 (40% of the 30) of those would act one way while the rest, or 18 (60% of the 30), would act another way. Observe that this is not going down to the one-at-a-time level and is instead figuring things out at a more consolidated level.

Whether this under-the-hood mechanism would steer your results in a direction that might have been different with a simulated one-at-a-time set is unclear. Sometimes it might end up differently, sometimes it might be just as good. You can also try, if you wish, to instruct the generative AI to do the one-at-a-time, steadfastly insisting on it, but again this might or might not be what the AI does. The generative AI might pull the wool over your eyes and insist that yes, it will do the pretense on a one-at-a-time basis, though revert to some other means internally.

Don’t believe everything that generative AI has to proffer.

Also, keep in mind that generative AI is like a box of chocolates, you never know for sure what it will do or what results it will generate.

There is an additional means to try and prod generative AI toward the one-at-a-time simulated avenue. You could write up some programming code that spews out a hundred or a thousand or many thousands of distinct assertions and then feed those as prompts into generative AI.

Here’s how that works.

You quickly put together a program in your favored programming language, let’s say it is Python. The code contains a list of the characteristics that you want each persona to randomly have when you get this loaded into the generative AI. Maybe you specify that the personas can be extroverted, introverted, emotional, logical, etc. The program produces a hundred distinct sentences (or whatever count is pertinent), each one saying something like “This persona is a lawyer that is big-time, extroverted, emotional”, “This persona is a lawyer that is big-time, introverted, emotional”, “This persona is a lawyer that is small-town, extroverted, emotional”, and so on.
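
As a minimal sketch of that idea in Python (the trait lists and the sentence wording here are purely illustrative assumptions on my part, not a prescribed format):

```python
import random

# Illustrative trait pools; adjust to whatever persona attributes matter to you.
PRACTICE_TYPES = ["big-time", "small-town", "Hollywood entertainment"]
DEMEANORS = ["extroverted", "introverted"]
STYLES = ["highly logical", "highly emotional"]

def generate_persona_sentences(count: int) -> list[str]:
    """Produce one descriptive sentence per persona, ready to paste into a prompt."""
    sentences = []
    for i in range(1, count + 1):
        practice = random.choice(PRACTICE_TYPES)
        demeanor = random.choice(DEMEANORS)
        style = random.choice(STYLES)
        sentences.append(
            f"Persona {i} is a lawyer that is {practice}, {demeanor}, and {style}."
        )
    return sentences

if __name__ == "__main__":
    # Print a hundred persona descriptions to feed into the generative AI as prompts.
    for line in generate_persona_sentences(100):
        print(line)
```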

The beauty here is that once you’ve fed the sentences as prompts into the generative AI, this is almost forcing the hand of the generative AI to make use of the personas in a one-at-a-time fashion. And, you didn’t have to compose the prompts one at a time either. You just used your quick-and-dirty program to do the hard work for you. As a caveat, please be aware that even this won’t necessarily mean that the internals of the generative AI will undertake its efforts on a one-at-a-time basis (it might, for example, consolidate them, or do other computational tomfoolery).

I think this gives you a useful essence of a significant consideration when going the mega-personas route. One of the heaviest preparatory aspects entails thinking deeply about what the mega-personas will consist of.

Use my eight key questions to guide your preparations:

There are additional considerations regarding leveraging the mega-personas prompting strategy. I’ll cover those as we go along and want to now turn toward identifying some major caveats and then examine recent research about this rising topic.

Diving Into Mega-Personas Prompting Caveats

You might be wondering why anyone would ever want to utilize the mega-personas prompting technique.

That’s an easy question, so thanks for asking.

The primary use would be to undertake a survey or perform some kind of group-oriented analysis when you are trying to assess something or figure something out. For example, suppose you wanted to do a survey of a hundred lawyers and ask them whether they like their job and whether they would pursue the legal field again if they had things to do over. You could try to wrangle up a hundred lawyers and ask them those pointed questions.

Finding a hundred lawyers that have the time and willingness to respond to your survey is probably going to be problematic. They are busy. They charge by the billable hour. They don’t have the luxury of sitting around and answering polling questions. Also, consider how hard it might be to reach them to begin with. Do you try calling them on the phone? Maybe send them emails? Perhaps try to reach them at online forums designated for attorneys? Best of luck in that unwieldy endeavor.

Envision that instead, you opt to have generative AI create a hundred pretend lawyers and have the AI attempt to answer your survey questions for you. Voila, with just a few carefully worded prompts, you can get your entire survey fully completed. No hassle. No logistics nightmare. Easy-peasy.

That being said, a humongous issue is whether or not the generative AI can suitably replace the act of contacting real-world human attorneys to take your survey. The faked or AI-pretending inanimate attorneys might not at all be responding in the same manner that human attorneys would. Your fast path to getting answers could be utterly bogus and completely misleading.

A rule of thumb then for mega-personas is that you should stridently at each step attempt to gauge whether your usage is on the up and up. As you know, nothing in this world seems to come for free, including that using the mega-personas has to be weighed in terms of the validity of the results versus making use of living breathing humans for answering your questions (and the costs and logistics hurdles thereof).

Sometimes you might turn to the use of mega-personas because there aren’t particularly viable alternatives.

Imagine that a poll is intended to find out what people perceive as being abundantly atrociously offensive. You are going to ask questions that are sickeningly repulsive. You might warn the respondents beforehand to brace themselves. But, in the end, even after your stern warnings, the respondents might report that they were shocked, disturbed, and cannot forget what they read. This raises keen ethical issues about your having proceeded. You could also land in legal trouble with some respondents opting to sue you.

Maybe the mega-personas might be able to act as a stand-in for real humans.

The logic for believing that generative AI can potentially be a prudent stand-in is that most generative AI was data-trained across a wide swath of data and text throughout the Internet, see my detailed explanation about this at the link here. The data and text that were scanned are likely principally stuff composed by humans. The computational pattern-matching of the generative AI has somewhat encapsulated the words and wordings that humans typically make use of. It is conceivable that the generative AI can ergo be a limited kind of proxy for humans, having been data-trained on what humans have to say.

Do not though for a split second equate modern-day generative AI with human sentience (I warned you about that earlier). Just because we’ve got these amazing computational pattern-matching apps that have widely scanned human text doesn’t mean that the AI is doing anything of a caliber associated with human reasoning.

Along those lines, some refer to the use of mega-personas as making use of so-called generative agents. This is a handy moniker because the idea is that the generative AI is simulating a kind of agent or actor-like capacity. Some though don’t like the naming due to the potential comingling with the word “agency” as though this is generative AI that exhibits human agency or human cognition. Other catchphrases and names used include artificial impersonators, simulacra, and so on.

A recent article in Science magazine noted this about the mega-personas prompting technique: “No one is yet suggesting that chatbots can completely replace humans in behavioral studies. But they may act as convenient stand-ins in pilot studies and for designing experiments, saving time and money. Language models might also help with experiments that would be too impractical, unethical, or even dangerous to run with people” (in an article entitled “Guinea Pigbots”, Science, July 2023, authored by Matthew Hutson).

Here’s something else you need to worry about when using mega-personas.

One additional significant risk that you always face with generative AI is that the generated response might contain oddball aspects. No matter what your prompt consists of, you essentially have no guaranteed means to avoid having something oddish appear. The generative AI can make an error and miscalculate at times, produce a falsehood, or emit what is known as an AI hallucination (this is verbiage that I disfavor since it is another kind of anthropomorphizing, but anyway has become a popular phrase and refers to the possibility that the generative AI will make-up facts or figures that are entirely fictitious, see my discussion at the link here and the link here).

The nature of your prompts can undoubtedly egg on the generative AI toward producing an oddish result. I mention this because any mega-personas enacting prompt should be mindfully composed. If you include extraneous aspects or purposely try to be funny or otherwise vary from a more serious and focused route, this tends to increase the chances that the generative AI will go afield.

I do want to emphasize that no matter how good your mega-personas prompt might be, you still are going to have to be wary of the generative AI going awry. It is an ever-present risk. You will need to double-check and triple-check any answers or results produced by the generative AI.

That notion of double-checking brings up an additional rule of thumb about mega-personas. As much as possible, when feasible, you would be wise to try and compare the generative AI-generated results to any human-devised responses pertaining to the same aspects you are presenting to the generative AI.

Here’s what I mean.

Suppose that last year a survey was done of human lawyers about their perceptions of their job and career. You decide that this year, you will instead use mega-personas and generative AI to do the same survey. No human lawyers will be involved.

You presumably could at least compare the results of the mega-personas response to that of last year’s human attorneys. If the generative AI results seem to be roughly on par, perhaps the mega-personas have found a means to suitably simulate human attorneys for this particular task. This seems somewhat reassuring.
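
To show what such a comparison might look like, here is a small Python sketch that contrasts an AI-personas answer distribution with a prior human survey; all of the percentages below are made-up placeholders for illustration, not real survey data:

```python
# Hypothetical answer distributions (percentages): last year's human survey vs. the mega-personas run.
human_survey = {"Very satisfied": 30, "Somewhat satisfied": 45, "Dissatisfied": 25}
ai_personas = {"Very satisfied": 35, "Somewhat satisfied": 40, "Dissatisfied": 25}

# Per-answer gap, in percentage points.
for answer in human_survey:
    gap = ai_personas[answer] - human_survey[answer]
    print(f"{answer}: human {human_survey[answer]}%, AI {ai_personas[answer]}%, gap {gap:+d} points")

# One rough summary number: total variation distance (0 = identical, 100 = completely different).
tvd = sum(abs(ai_personas[a] - human_survey[a]) for a in human_survey) / 2
print(f"Total variation distance: {tvd:.1f} percentage points")
```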

Aha, you cannot go around cheering that you got it on the nose. Various twists are involved. One twist is that the generative AI might have scanned last year’s survey results and therefore is merely parroting what happened last year. You are not getting a fresh perspective. You were tricked into thinking that just because the answers were similar, the AI was doing a bang-up job of simulating things. It maybe wasn’t. It might have just pushed along patterns it already saw.

Another twist is that suppose something dramatic has happened in the real world that changes things. For example, the odds are that the answers by human attorneys might have changed during the pandemic versus before the pandemic. If you get similar answers by the mega-personas to a prior year that was before the pandemic, the latest answers by the AI might not be at all akin to what human attorneys would say after the pandemic.

I won’t continue to go through all the permutations and combinations of things that can make the mega-personas a false portrayal. You are forewarned that any use of the mega-personas has to be taken with a huge grain of salt. Additionally, you need to devise as many solid checks and balances as possible, aiming to ascertain the validity and veracity of your generative AI results.

That seems somewhat downbeat. I don’t want to leave that impression. So, let’s next walk through an upbeat example of how you might use mega-personas.

Using Mega-Personas As A Testing Tool

You are in the midst of developing a new dating-focused app that will work on people’s smartphones. If all goes well, you believe that millions upon millions of people will eagerly download your fantastic app. Some venture capitalists also believe in your visionary app. Money has been dropped mightily into your startup. The world is your oyster.

Exciting times!

Given the rush to get the dating app into the marketplace, you decide to have a handful of friends and relatives try it out. Based on their feedback you make various changes. The interface and usage aspects are ready to go.

Unfortunately, and I don’t want to burst any bubbles, you release the dating app, which eagerly gets loaded by thousands upon thousands of users, and then, regrettably, problems and bugs start to roll in, aplenty. You have shot yourself in the foot, as it were. Thousands of users clamor on social media and report that the dating app is a piece of junk. People stop downloading it. The people who loaded it onto their smartphones delete it.

Sad face.

Let’s turn back the clock.

After having had your friends and relatives try out your dating app, you realize that having only a half dozen people test your app is insufficient. You need more people to make use of it. This will take time and likely cost you a pretty penny to find people that will do the testing.

What do you do?

Side note, I hope that you are yelling out that you would make use of mega-personas and generative AI.

With a bit of technological mastery, you could instruct the generative AI to pretend to be thousands of users. You would specify that the users are to be a cross-section of individuals based on various profiles you might provide. Then, you connect the generative AI to your dating app (this requires some techie finagling), and have it go at your budding app.
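
As a rough illustration of the plumbing involved, here is a hedged Python sketch; it assumes an OpenAI-style chat completion API and a hypothetical present_screen_to_persona hook into your own app, so treat it as the shape of the approach rather than a ready-made test harness:

```python
from openai import OpenAI  # assumes the OpenAI Python SDK; any chat-capable LLM API would do

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def simulate_tester(persona_description: str, app_screen_text: str) -> str:
    """Ask the model, role-playing one persona, to react to a described app screen."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative model choice
        messages=[
            {
                "role": "system",
                "content": f"Pretend to be this app tester: {persona_description}. "
                           "React candidly to the dating app screen you are shown.",
            },
            {"role": "user", "content": app_screen_text},
        ],
    )
    return response.choices[0].message.content

# Hypothetical hook: in a real harness you would render or describe each screen of your app here.
def present_screen_to_persona() -> str:
    return "Sign-up screen: asks for name, age, and five profile photos before showing any matches."

personas = [
    "a 19-year-old college student who rarely reads instructions",
    "a 45-year-old professional who is privacy-conscious",
    # ...hundreds or thousands more, generated programmatically as described earlier
]

for persona in personas:
    feedback = simulate_tester(persona, present_screen_to_persona())
    print(f"--- {persona}\n{feedback}\n")
```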

This could potentially alert you to the same kinds of reactions that real humans might have. You could then make adjustments to the app and repeat the process of having the generative AI take another shot at your application. Is this a silver bullet? Does this mean you never have to use humans as testers again?

Nope.

I hope that by now you realize that as I’ve tried to emphasize, we don’t know for sure that the mega-personas will be on par with what human testers might have to say. The generative AI might be entirely on-target, or maybe half on-target, or completely off-target. Go back and review my comments about doing sufficient double-checks at each stage of using mega-personas.

I’ll leave it for now that your dating app became wildly successful because you did extensive testing with mega-personas and the app was in tiptop shape when released.

A happy face ensues.

Research Insights About These Hefty Matters

I had mentioned that you have to be wary of generative AI errors, falsehoods, AI hallucinations, and the like. There is something else that generative AI can potentially imbue. The watchword to be cognizant of is biases. Yes, generative AI can inherently contain biases and therefore produce essays and generate results that exhibit undue biases. I’ve discussed in my column postings the qualms about generative AI discriminating based on factors such as race, age, gender, and so on (see the link here and the link here, just to name a few).

Where do these biases come from?

The answer is somewhat straightforward, namely that the data training undertaken by scanning data and text on the Internet is indubitably going to pick up patterns of human-based biases. Anyone that looks at the content on the Internet well knows the amount of potentially biased and discriminatory materials out there on the web. Generative AI is going to pattern-match on it and leverage those patterns when producing answers to your questions.

A concern then about leveraging mega-personas is that the multitude of personas that you ask the generative AI to create or simulate will almost by definition likely contain biases. Those biases captured within the pattern-matching are going to subtly or sometimes loudly find their way into the concocted personas. With mega-personas, this means biases on a large scale.

This is a daunting facet. Your effort to have the generative AI pretend to be, say, a thousand people that are going to take a survey might be marred by those hidden or at times explicitly portrayed biases. Some have argued that we need to expunge all bias out of generative AI. We need to scour every nook and cranny of a generative AI structure and squeeze out the biases.

A counterargument retorts that if you did somehow remove all biases (a challenging feat, as I discuss at the link here), you are ironically going to undercut whatever survey you are intending to administer to the mega-personas. The reason for this is simple. Humans do have biases. If you are intending to use the mega-personas as a surrogate for opinions expressed by humans, the excising of biases from the mega-personas will render the simulation as askew of what humans really say and do.

I trust that you notice the conundrum.

Do we allow generative AI to have all manner of biases due to the data training done off of human written content on the Internet, thus resembling human biases to some degree, or do we clean up the generative AI and make sure it contains little or no biases?

You might be surprised to know that the efforts by various AI makers to suppress or remove biases have been upsetting to some that believe this is skewing generative AI. You might seek to do a mega-personas usage and get results that are completely afield of what humans would truly express. This cleaned-up AI would seem of marginal value in being able to reflect human values.

A research study examined these issues and suggested that for mega-personas, you might aim to steer generative AI rather than fight it in terms of those inherent biases. In a research study entitled “Out of One, Many: Using Language Models to Simulate Human Samples” by authors Lisa P. Argyle, Ethan C. Busby, Nancy Fulda, Joshua Gubler, Christopher Rytting, and David Wingate, posted online September 2022, they posited this approach:

The idea is that since the biases are all over the place throughout the generative AI, you have to set aside the false notion that there is one giant monolithic spot that contains the biases and hosts them all at once. Biases are splintered and fragmented in generative AI. If you handle things with aplomb, you might be able to steer the biases as representative of real-world human sub-populations.

They coin this as steering the algorithmic fidelity of the generative AI.

You can use your mega-personas prompting to guide where the biases are going to land. Of course, once again, your efforts to prompt the AI into doing this can be met with mixed success. Maybe the generative AI will do as you ask, or maybe not. The generative AI might tell you that it has done what you wanted, but the end results might not showcase this to be the case. Double-check, triple-check, etc.

It is still worthwhile trying. As per the research paper, you can make use of algorithmic fidelity to steer the many biases so that the mega-personas seemingly reflect real-world sub-populations:

Another research study that also explored the biases conundrum opted to create 320 personas and assign them gender indications. Will male-oriented personas differ from female-oriented personas? Can this be revealed by having the mega-personas take a standardized personality test and produce a writing sample?

In a paper presented at the 9th International Conference on Computational Social Science (IC2S2), taking place July 2023, entitled “PersonaLLM: Investigating the Ability of GPT-3.5 to Express Personality Traits and Gender Differences” by authors Hang Jiang, Xiajie Zhang, Xubo Cao, and Jad Kabbara, the researchers sought to explore these matters:

Here’s a snippet of how they created the mega-personas (make sure to read the full study to see the whole experiment):

Shifting gears, suppose you want the mega-personas to be reflective of a particular sub-population of humans. If you can get data that is reflective of that specific sub-population, you might try to use the data to briefly data-train the generative AI accordingly. For example, maybe you can collect tweets that have focused on a specific topic, and you believe that data-training the generative AI on those tweets will get the AI toward devising mega-personas of that ilk. This is often described as a community-based method. You are narrowing your focus to a particular community of interest.
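
If you want a sense of the mechanics, here is a heavily hedged Python sketch of the data-preparation side; it assumes you already have a file of collected tweets and that your chosen generative AI accepts chat-formatted fine-tuning examples in JSONL form (the file names and wording are illustrative assumptions):

```python
import json

# Hypothetical input: one collected tweet per line, already filtered to the community of interest.
TWEETS_FILE = "community_tweets.txt"
OUTPUT_FILE = "community_finetune.jsonl"

with open(TWEETS_FILE, encoding="utf-8") as f:
    tweets = [line.strip() for line in f if line.strip()]

# Package each tweet as a chat-style training example so the AI "speaks" in the community's voice.
with open(OUTPUT_FILE, "w", encoding="utf-8") as out:
    for tweet in tweets:
        example = {
            "messages": [
                {"role": "system", "content": "Respond in the voice of this online community."},
                {"role": "user", "content": "Share your view on the topic."},
                {"role": "assistant", "content": tweet},
            ]
        }
        out.write(json.dumps(example) + "\n")

print(f"Wrote {len(tweets)} training examples to {OUTPUT_FILE}")
```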

A research study took that approach. Let’s take a quick look at a research paper entitled “COMMUNITYLM: Probing Partisan Worldviews from Language Models” by authors Hang Jiang, Doug Beeferman, Brandon Roy, and Deb Roy, included in the Proceedings of the 29th International Conference on Computational Linguistics, October 2022.

The work indicates that you might be able to tailor generative AI toward population segments when doing mega-personas. One issue is that the generative AI might not allow you to do this or possibly restrict you, depending on how the AI maker has established the generative AI. The on-the-fly data-training option might be limited in terms of the extent that you can do so and the amount of data that you can input into the generative AI.

Your mileage may vary.

The last study I’ll take a look at in this discussion provides an illustration of my earlier point that you could use mega-personas to perform testing of a system that you are devising. The research paper is entitled “Social Simulacra: Creating Populated Prototypes for Social Computing Systems” authored by Joon Sung Park, Lindsay Popowski, Carrie J. Cai, Meredith Ringel Morris, Percy Liang, and Michael S. Bernstein, posted online August 2022.

They proceeded to create essentially a thousand personas that appear to be part of an online forum. As part of their research, they crafted a web-based social simulacra tool that enables the creation of mega-personas for these purposes:

One aspect of their study reminds me that another way to create mega-personas consists of using examples and then asking the generative AI to extrapolate them further. I’ve previously discussed that prompt engineering includes the use of show-me prompts and tell-me prompts, see the link here. In a show-me prompt, you give an example (one-shot) or series of examples (few-shot) of what you have in mind. The generative AI hopefully figures out a suitable pattern and can extend this to other instances. The tell-me is when you explicitly tell the generative AI what you want to have.

In this particular research study, they opted to use the show-me by providing a handful of examples and then told the generative AI to use those to craft a thousand personas:

Keep that handy trick in your mega-personas prompting toolkit. You can at times do a tell-me wherein you provide an explicit list of what you want the personas to be, or you can use a show-me that provides examples for the generative AI to pattern on.
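
For instance, a show-me variant might provide a few seed personas and ask the generative AI to extrapolate (these seeds are my own illustrative stand-ins, not the ones used in the study): “Here are three example forum members: a retired teacher who posts long, polite replies; a college student who posts memes and one-liners; a moderator who nudges people back on topic. Following this pattern, generate one thousand distinct forum member personas.”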

My Experiment On Mega-Personas

As a quick example of using the mega-personas prompting strategy, I opted to do a survey of one hundred generated AI personas. I made use of ChatGPT and instructed the AI app to pretend that the one hundred personas were lawyers.

Here is my mega-personas-invoking prompt to ChatGPT:

I decided that the lawyers would be U.S. lawyers and so I also provided some demographics that were collected and published by the American Bar Association in their ABA Profile Of The Legal Profession 2022 which is posted online. This is their latest analysis and probably is relatively current for the year 2023, subject to modest changes from year to year.

Here’s what ChatGPT responded with:

Now that ChatGPT seemed to acknowledge what I’m trying to do, I then used this next prompt to dive deeper into the matter.

My prompt to ChatGPT:

ChatGPT responded accordingly:

I had prepared for the generative AI session by having found a survey of lawyers that was undertaken by Law360 Pulse, available online and stated as conducted this way:

I started with this question from their survey of human attorneys, and I show you the percentages as reported by the published study:

ChatGPT provided this indication of the one hundred mega-personas:

You will notice that the percentages of the ChatGPT mega-personas are significantly afield of the percentages from the actual human-taken survey of lawyers. For example, the actual human result indicated for “Very satisfied” was 6%, while the mega-personas indicated a much higher 22% for the same answer. Off by a country mile, as they say.

I wondered whether this might be because the survey of the human lawyers was undertaken in early 2022, and my attempt was undertaken in the latter part of 2023. That doesn’t seem, though, a likely basis for such a wide disparity.

Anyway, I remained undeterred and continued my experiment. I’m a true trooper.

For my next prompt, I used this question and I show you the percentages based on the survey of human lawyers (I didn’t give the percentages to ChatGPT, in case that’s what you were thinking):

ChatGPT responded with this:

Bingo!

If you inspect the percentages, they are pretty darned close. Some modest differences but the generative AI mega-personas seemed to make a near bulls-eye. I subsequently did an entire set of survey questions and at times the percentages were spot on and other times they were as different as the gaping size of the Grand Canyon.

The last question on the survey was this one (I show the percentages as answered by human attorneys):

ChatGPT responded with this:

I nearly fell off my chair at the closeness of the percentages. The surveyed humans said Yes for 73% of the time, and the mega-personas said Yes for 76% of the time (likewise, 27% and 24% for the No response).

Some thoughts about the experiment might be useful. You have to always check and see whether generative AI might have been initially data-trained on the content that you find in the real world. Suppose that ChatGPT had been data-trained on the survey and its results (as posted online already). But in this instance, you would nearly rule this out due to the data-training cutoff date of ChatGPT. On the other hand, ChatGPT is being maintained and updated, often with user-entered prompts and content, so it is conceivable that ChatGPT already had seen the collected stats.

The curious aspect would be that if that was the case, why would the mega-personas sometimes be head-on and other times be wildly afield? Another explanation could be that some of the questions and their results appear in other data-trained sources, thus the pattern-matching had found bits and pieces at some earlier point in time.

I decided to ask ChatGPT how it came upon the mega-personas results in this case.

Here’s what ChatGPT responded with:

That’s a compelling explanation. Whether it accurately depicts what the generative AI is really doing under the hood would be another question altogether.

Conclusion

Mega-personas can be a giant plus for how to glean the benefits of using generative AI.

You can combine mega-personas with a plethora of other prompt engineering methods and techniques. The show-me and tell-me prompting strategies go hand-in-hand with the use of mega-personas. The use of chain-of-thought approaches and skeleton-of-thought approaches also apply to mega-personas. Mix and match the prompting strategies as befits the situation at hand to your use of mega-personas.

I do though want to remind you that with great promise also comes great potential peril.

Suppose an evildoer uses mega-personas to create thousands of simulated generative agents and puts them toward a dastardly deed. Not good. Or suppose that someone that believes strongly in a particular cause opts to harness generative AI and mega-personas to push a fake grassroots effort of their choosing (known as astroturfing). The person doing so might entirely believe in their cause and the cause might otherwise be considered above board.

Is this what we as a society would believe is a justifiable use of this AI capability?

An irony of sorts about mega-personas is that the human-like mimicry of generative AI can sometimes be used by the very humans who were supposed to complete a survey or perform a task, with the AI serving in lieu of their direct involvement. Here’s what I mean. You set up an online survey and offer to pay people to take the survey. You want humans to perform the task. Turns out that some of those humans instead make use of generative AI on a single-persona or multi-persona or mega-personas basis to be their stand-in. Your survey response by ten thousand people might consist of, let’s say, three thousand who used generative AI rather than taking the survey personally. There have been ongoing stories in the mainstream media about crowdsourced work efforts that were done by AI even though the researchers earnestly aimed to tap humans rather than AI.

It's a tough world out there.

In any case, go ahead and add mega-personas to your prompt engineering repertoire. Treat this tool with due care and be respectful of how you use it. If you don’t practice suitable caution, I might just set up a thousand personas and have them remind you to be cautious when using mega-personas. Maybe I’ll make a million mega-personas to be your siren call.

You don’t want that on your back, I assure you.