Pointers & References
A brief explanation of Pointers & References
You use a pointer when the object you are passing may legitimately be null, and you use a const reference in just about every other situation, unless the object is small enough that you don't have to worry about passing it by value or you want the function to modify the caller's object through a mutable reference.
So, let's talk about what that means.
C++ (and really just about every other programming language) tracks what is going on with a data structure called The Stack. The Stack is managed by the program execution, and is probably the oldest case of managed memory in programming. The Stack is just a section of memory where we keep the current state of the executing code.
As you rightfully noted, there are 3 ways to pass data into functions. These are By Value (no symbols), By Reference (with a & symbol), and By Pointer (with a * symbol).
So, to understand why you use each of these function parameter types, you must first understand how functions actually work. When you call a function, your program will place 8 bytes (the return address, on a typical 64-bit platform) onto The Stack. This keeps track of where to resume in your program code once the function finishes. If your function has no parameters, then only those 8 bytes will be pushed onto The Stack, and they will be freed once the function returns.
However, if you have Parameters, then those Parameters are pushed onto The Stack as well (usually; I'm not going to go into inlining in this explanation). How much data is placed onto The Stack in this situation is determined by the size of those objects. Integers? 4 bytes. Chars? 1 byte. An FVector? 12 bytes (3 floats * 4 bytes per float = 12 bytes). Also, something to keep in mind: if you Pass By Value (i.e., no symbols, just Foo(Type Name);), that data (or object) is Copied onto The Stack. It's not moved, it's not referenced. It's full on duplicated. This is important, as I will get into later.
Now here is where things get really interesting. If you pass an object by Reference (&) or Pointer (*)... You place 8 bytes onto The Stack. No matter the size of the underlying object. Why? You aren't actually putting the whole object onto The Stack, you are simply giving the address of that object. In (most) 64-bit programs, an address is 8 bytes (64 bits). This ends up saving a lot of space when you are passing around larger objects! We obviously don't want to pass ints (4 bytes) or chars (1 byte) by reference or pointer very often, as that would increase the amount of space used... but that FVector is a really tasty candidate for saving some space on The Stack. There are much larger objects (such as FStrings and TArrays) that make a lot of sense to pass by Reference as well.
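To make the size argument concrete, here is a minimal standalone sketch in plain C++. Vec3 and BigThing are hypothetical stand-ins (Vec3 mirrors the three-float FVector described above; neither is an engine type). It assumes 4-byte floats, and the exact pointer size depends on your platform.

```cpp
#include <cstddef>

// Hypothetical stand-in for the FVector described above: three 4-byte floats.
struct Vec3
{
    float X;
    float Y;
    float Z;
};

// Hypothetical larger object: 100 vectors stored inline.
struct BigThing
{
    Vec3 Points[100];
};

// Passing by value copies the whole object.
constexpr std::size_t BytesByValueVec3 = sizeof(Vec3);     // 3 * 4 = 12 bytes
constexpr std::size_t BytesByValueBig  = sizeof(BigThing); // 100 * 12 = 1200 bytes

// Passing by pointer (and, in practice, by reference) only hands over an address.
constexpr std::size_t BytesByAddress = sizeof(BigThing*);  // 8 on typical 64-bit targets

static_assert(BytesByValueVec3 == 12, "assumes 4-byte floats");
static_assert(BytesByAddress < BytesByValueBig, "an address is far smaller than the object");
```

The address stays the same size no matter how large BigThing grows, which is the whole point of passing big objects by reference or pointer.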
Speaking of FStrings and TArrays... Remember when I said that passing by value copies the object? Yeah... so it copies TArrays and FStrings if you pass them by value. If you pass by Reference, you can prevent that copy as well! You are simply saying in code "Don't copy this object, instead use this object that was passed in directly!". This is an incredibly useful feature of C++ and lets you save a ton of time and memory when doing simple operations like passing objects into a function. Converting all the array function parameters to const TArray<T>& and strings to const FString& increased performance by 25% in a larger game I worked on. Avoiding the Pass By Value copy operations is a big deal for larger and more complicated objects. Obviously, smaller and more trivial objects can be passed by value easily. For me, personally, anything larger than an FVector is being passed by const reference.
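If you want to see the copy happen, the following sketch counts copy-constructor calls. Tracked and all function names are hypothetical and exist only for this illustration; a real TArray or FString copy is far more expensive than bumping a counter.

```cpp
#include <cassert>

// Hypothetical type that counts how often it gets copied.
struct Tracked
{
    static int CopyCount;
    int Value = 0;

    Tracked() = default;
    Tracked(const Tracked& Other) : Value(Other.Value) { ++CopyCount; }
};

int Tracked::CopyCount = 0;

// By value: the argument is copy-constructed into T on every call.
int ReadByValue(Tracked T) { return T.Value; }

// By const reference: T refers to the caller's object, so no copy is made.
int ReadByConstRef(const Tracked& T) { return T.Value; }

// Returns how many copies one call of the chosen style triggered.
int CountCopies(bool bByValue)
{
    Tracked Object;
    Object.Value = 7;
    Tracked::CopyCount = 0;
    const int Read = bByValue ? ReadByValue(Object) : ReadByConstRef(Object);
    assert(Read == 7);
    return Tracked::CopyCount;
}
```

Calling CountCopies(true) reports one copy and CountCopies(false) reports none, even though both functions read exactly the same value.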
One thing to note is that passing stuff by Reference lets you modify the caller's version of that data. If you call some Foo(int& Bar) with something like Foo(A), any changes in Foo to Bar will affect A. This is a really useful property of passing by reference, but there are times when you very much do not want that to happen. It may be prohibitively expensive to pass By Value instead, so what you can do is add the const modifier to the parameter. Const means, at its core, "This cannot be modified". Where you put it matters in most cases, but this is about function parameters so I'm just going to talk about that. So, the function Foo(const int& Bar) is saying "Bar cannot be modified, it's const", which means Foo cannot change the caller's data through Bar. You get a compiler error if you attempt to modify Bar, which is exactly what you want.
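Here is a small sketch of the three parameter styles side by side. The function names are made up for illustration.

```cpp
#include <cassert>

// By value: Bar is a copy, so the caller never sees this change.
void AddOneByValue(int Bar) { Bar += 1; }

// By mutable reference: Bar is the caller's variable.
void AddOneByRef(int& Bar) { Bar += 1; }

// By const reference: Bar can be read but not written.
int PlusOne(const int& Bar)
{
    // Bar += 1; // would not compile: Bar is const
    return Bar + 1;
}

int ParameterStylesExample()
{
    int A = 5;
    AddOneByValue(A);          // A is still 5
    assert(A == 5);
    AddOneByRef(A);            // A is now 6
    assert(A == 6);
    const int B = PlusOne(A);  // B is 7, A is still 6
    assert(A == 6);
    return B;
}
```

For an int you would normally just pass by value; the int here only keeps the example short. The const reference version pays off with larger types.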
Finally, why use References over Pointers? Simple: Pointers can be Null, and that is a useful property. Null means "This object doesn't exist", and a reference cannot be Null (it is undefined behavior to have a null reference). So, for objects where being Null is a valid state (like anything that can be garbage collected such as UObjects), it makes sense to pass things by Pointer rather than by Reference. When you pass a value By Pointer, much like By Reference, only 8 bytes are pushed onto The Stack and the object is not copied. The difference is that passing By Pointer is an explicit declaration that the value can be Null and you should absolutely check for that and handle the case where it is. If you pass By Reference, you are saying that it can never be Null.
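A rough sketch of how that contract shows up in function signatures. Actor here is a hypothetical plain struct, not a UObject, and the function names are invented for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for a game object; not an engine type.
struct Actor
{
    int Health = 100;
};

// Pointer parameter: "there might be no target", so check before use.
int GetHealthOrDefault(const Actor* Target, int DefaultHealth)
{
    if (!Target)
    {
        return DefaultHealth;
    }
    return Target->Health;
}

// Reference parameter: the caller promises a real object, so no null check.
int GetHealth(const Actor& Target)
{
    return Target.Health;
}

void NullabilityExample()
{
    Actor Player;
    assert(GetHealthOrDefault(&Player, 0) == 100);
    assert(GetHealthOrDefault(nullptr, 0) == 0); // null is a valid input here
    assert(GetHealth(Player) == 100);
}
```

The signature itself documents the expectation: a reader of GetHealthOrDefault knows null has to be handled, while GetHealth cannot even be called without an object.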
Also, it's an error to pass UObjects By Value. They must be passed either By Pointer (do this 99% of the time) or By Reference (you better have a damn good reason to do this, but it is done in the engine). The reason is that Epic does some cool and very technically complicated things to UObjects that prevent the by value copy from working correctly.
So that is, in a rather simple nutshell, why you would use one form over another. They are all good and useful.
Pointers (*):
- Can be null.
- Have a memory address.
- Can be changed to point at a different object than the one they previously pointed to.
- Can't point to a reference itself; taking the address of a reference gives you the address of the object it refers to.
References (&):
- Can reference a pointer or the object in memory itself.
- Have no address of their own, so you can't make a pointer to the reference itself.
- Once you declare one, you can't change it to reference something else.
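The points in these two lists can be sketched in a few lines of plain C++. All names below are illustrative only.

```cpp
#include <cassert>

// A pointer can be re-pointed at a different object.
int PointerCanBeReseated()
{
    int A = 1;
    int B = 2;
    int* Ptr = &A;  // Ptr points at A
    Ptr = &B;       // now Ptr points at B instead
    *Ptr = 20;      // writes to B, A is untouched
    assert(A == 1);
    return B;       // 20
}

// A reference stays bound; assigning through it writes to the original object.
int ReferenceStaysBound()
{
    int A = 1;
    int B = 2;
    int& Ref = A;        // Ref is another name for A
    Ref = B;             // does NOT rebind: copies B's value (2) into A
    B = 30;              // Ref is unaffected, it still refers to A
    assert(&Ref == &A);  // the address of a reference is the address of its object
    return Ref;          // 2
}

// A reference can refer to a pointer.
int ReferenceToPointer()
{
    int A = 1;
    int B = 2;
    int* Ptr = &A;
    int*& PtrRef = Ptr;  // reference to the pointer itself
    PtrRef = &B;         // re-points Ptr through the reference
    assert(A == 1);
    return *Ptr;         // 2
}
```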
Additional Context
Pointers (*) and References (&). Man, I find it hard to use these. What exactly are they? Can you include some examples on how to use and where to use and what's the advantage of using it?
This is perhaps the most important thing for newcomers to understand to really get rolling with C++.
Once you have a firm grasp on Pointers you should be coding happily in UE4 C++!
Pointers are extremely powerful, and that same power makes them a bit dangerous.
Pointers require you to be a diligent coder. In return they give you speed and power.
A pointer must point to a memory address where the actual data is stored.
To get the memory address for the pointer to point to, you use &:
FVector Location = FVector(1, 2, 9000);
FVector* LocationPtr = nullptr; //LocationPtr currently points to NOTHING
LocationPtr = &Location; //LocationPtr now points to the memory address of Location
Always Check Your Pointers
Before trying to access the data that pointers are supposed to be pointing to, you must always check if they actually are!
check(LocationPtr);
or
if(!LocationPtr) return;
You have to do this because you can never assume that a pointer still points to valid data.
Using check will crash the game deliberately if the pointer is not valid. You would do this if you consider it absolutely critical that a pointer never be invalid at a particular point in your code.
In the case of returning out of the function, this could be a "silent" fail case that you would not detect unless you print out a screen or log message as you exit.
The check method guarantees you will know what happened, which is that a pointer that absolutely must be valid wasn't.
De-Referencing Pointers
After you've verified the pointer does point to valid data, you can dereference it to access the data.
De-Reference with *
FVector NewVector = FVector::ZeroVector;
if(LocationPtr)
{
    NewVector = *LocationPtr; //De-Referencing pointer
}
De-Reference with ->
if(!LocationPtr) return;
const float XValue = LocationPtr->X;
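The snippets above need the engine to compile. As a rough standalone equivalent, here is the same check-then-dereference flow in plain C++. SimpleVector is a hypothetical stand-in for FVector, and the standard assert macro is only an approximate substitute for the engine's check.

```cpp
#include <cassert>

// Hypothetical stand-in for FVector; not the engine type.
struct SimpleVector
{
    float X;
    float Y;
    float Z;
};

// Mirrors the check(...) style: a null pointer here is treated as a bug.
float ReadXChecked(const SimpleVector* LocationPtr)
{
    assert(LocationPtr != nullptr);
    return LocationPtr->X;      // dereference with ->
}

// Mirrors the early-out style: a null pointer quietly yields a fallback value.
SimpleVector CopyOrZero(const SimpleVector* LocationPtr)
{
    SimpleVector Result{0.0f, 0.0f, 0.0f};
    if (LocationPtr)
    {
        Result = *LocationPtr;  // dereference with *
    }
    return Result;
}
```

Which style fits depends on whether a missing object is an expected situation or a programming error at that point in the code.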
Why Use a Pointer?
- Pointers give you direct access to memory locations that might be far, far away from your local context in which you are running code.
- Pointers also give you a way to access huge amounts of data without creating a copy of that data.
- Pointers give you a living connection, a dynamically updated link to data that will always be current because the pointer is an indirect reference that does not have to change itself to update along with the data it is pointing to.
Access Data That is Far, Far Away
Let's say you have a Character, that is part of a sublevel, that is part of a sub world, that is part of a far away Galaxy,
and to actually finally reach this Character's current Armor variable, you have to travel through a whole series of Gets, like this:
GetGalaxy()->GetSolarSystem()->GetPlanet()->GetMainCharacter()->Armor;
The above operation is costly and takes time to execute. It is also complicated to read in code.
The above code is, however, a way to access the current armor of a Character in a Galaxy far far away.
But wouldn't it be nice to be able to do this complicated retrieval operation only once, and obtain a sort of link, some kind of a pointer to the actual data, for quick access?
FArmorStruct* TheArmor = & GetGalaxy()->GetSolarSystem()->GetPlanet()->GetMainCharacter()->Armor;
Now, to obtain data about the armor, you don't have to write
GetGalaxy()->GetSolarSystem()->GetPlanet()->GetMainCharacter()->GetCurrentArmor().Durability;
GetGalaxy()->GetSolarSystem()->GetPlanet()->GetMainCharacter()->GetCurrentArmor().Color;
GetGalaxy()->GetSolarSystem()->GetPlanet()->GetMainCharacter()->GetCurrentArmor().Size;
which is both confusing to read, and does actually incur a runtime cost on the CPU to do all those get operations to get across the Galaxy to the Character's armor.
Instead you can write:
//Always Check Your Pointers
if(!TheArmor) return;
TheArmor->Durability;
TheArmor->Color;
TheArmor->Size;
See? Isn't that easier to read?
But wait there's more!
Access Huge Quantities of Data Any Time
Pointers can point to relatively small amounts of data, or vast and huge quantities of data!
The pointer is pointing to a location in memory,
the actual amount of memory involved could be enormous!
So continuing the above example, you might say, "Why don't I just create a copy of the Character's Armor?"
FArmorStruct ArmorVar = GetGalaxy()->GetSolarSystem()->GetPlanet()->GetMainCharacter()->Armor;
But what if the FArmorStruct contains a huuuuge amount of data, and you want to do this for 300 Characters across the Galaxy?
You would be copying this data many times over!
But the data already exists in one memory location.
Why would you copy it over just to access it, thereby duplicating the data?
Pointers enable you to avoid this entirely!
You can simply point to the one memory location and access it any time.
Stay Current With Runtime Changes
Also, in the above case, once you copy the armor data, it is no longer the actual armor of the Character on the other world.
You have lost the living connection to the Armor.
So if the Character on the other world chooses to change their armor color, you won't know!
Pointers give you an actual living, dynamically updated link
to potentially large quantities of data,
that you only want to have to obtain access to once.
So again, the correct way:
FArmorStruct* TheArmor = & GetGalaxy()->GetSolarSystem()->GetPlanet()->GetMainCharacter()->Armor;
Now you have easy access and a living connection to data that is in a Galaxy far far away,
without ever having to duplicate any data.
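The difference between a copy and a living connection can be sketched in plain C++. ArmorData and Character below are hypothetical stand-ins for the armor example, not engine types.

```cpp
#include <cassert>

// Hypothetical stand-ins for the armor example; not engine types.
struct ArmorData
{
    int Durability = 100;
};

struct Character
{
    ArmorData Armor;
};

// A copy is a snapshot: later changes to the character's armor are not seen.
int DurabilitySeenByCopy()
{
    Character Hero;
    ArmorData Snapshot = Hero.Armor;    // duplicates the data
    Hero.Armor.Durability = 40;         // the real armor takes damage
    return Snapshot.Durability;         // still 100
}

// A pointer is a live link: it reads whatever the armor holds right now.
int DurabilitySeenByPointer()
{
    Character Hero;
    ArmorData* TheArmor = &Hero.Armor;  // no data is duplicated
    Hero.Armor.Durability = 40;
    assert(TheArmor != nullptr);        // always check your pointers
    return TheArmor->Durability;        // 40
}
```

Keep in mind that the link is only as good as the object behind it: the pointer stays usable only while the Character it points into is still alive.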
Passing Data by Reference
Another use of & is to pass data by reference into functions.
This is especially important for very large quantities of data that you would not want to copy into the function context!
int32 AMyClass::GetArraySize(const TArray<uint8>& MyHugeBinaryArray) const //uint8 is an assumed element type for a binary array
{
    return MyHugeBinaryArray.Num(); //you use . instead of -> because you are passing by Reference
}
If you don't use the &, then you will be copying the entire huge array, just to find out how big it is!
Try to pass by reference wherever you can, instead of passing in pointers. It is much safer :)
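As a standalone sketch of the same idea outside the engine, the version below uses std::vector as a stand-in for TArray. ByteArray, SharesCallerStorage and the example function are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// std::vector<std::uint8_t> is a standard-library stand-in for a TArray of bytes.
using ByteArray = std::vector<std::uint8_t>;

// Const reference: no copy, and the function cannot modify the caller's array.
std::size_t GetArraySize(const ByteArray& MyHugeBinaryArray)
{
    return MyHugeBinaryArray.size();
}

// True when the parameter uses the very same storage as the caller's array,
// i.e. nothing was copied on the way in.
bool SharesCallerStorage(const ByteArray& Param, const std::uint8_t* CallerData)
{
    return Param.data() == CallerData;
}

void PassByReferenceExample()
{
    ByteArray Huge(1000000);  // one million zero-initialized bytes
    assert(GetArraySize(Huge) == 1000000);
    assert(SharesCallerStorage(Huge, Huge.data()));
}
```

Because the parameter is a reference, the function body uses . rather than ->, and the million bytes are never duplicated just to ask for their count.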