Discussion:
[sage-devel] Weak references in the coercion model
Jeroen Demeyer
2018-12-03 10:46:49 UTC
Permalink
I am studying the coercion model in detail, looking for optimization
opportunities. One source of slow-down is the use of weak references.

Over time, more and more places in Sage use weak references. But I'd
like to look at the big picture and see where weak references should be
used and where not.

Take coercion maps for example. The coercion model stores a weak
reference to the coercion maps and the maps also store a weak reference
to the domain (but not the codomain).

It's not clear to me why this double weak reference is needed, but maybe
I'm missing something. It seems more logical to use strong references in
the coercion map but then store a weak reference to the map.

I'd like to "fix" this with action maps first, which are conceptually
simpler.


Jeroen.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Jean-Pierre Flori
2018-12-03 11:44:04 UTC
Permalink
You should surely be able to extract some info in this ticket where we
fought hard memory leaks:
https://trac.sagemath.org/ticket/715
Maybe comment 75 though I did not really go through the whole ticket:
https://trac.sagemath.org/ticket/715#comment:75
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Nils Bruin
2018-12-03 16:21:52 UTC
Permalink
Post by Jeroen Demeyer
It's not clear to me why this double weak reference is needed, but maybe
I'm missing something. It seems more logical to use strong references in
the coercion map but then store a weak reference to the map.
The weak reference for the domain on coercion maps was introduced to
accommodate coercion maps into longer-lived codomains. Coercion maps are
generally cached on the codomain, because generally the codomain is
constructed from the domain and hence shorter lived (the codomain generally
references the domain anyway). This is not the case when mapping number
fields into, say Qbar or RR. Then it may well be the case that the domain
needs to be collected before the codomain. In order to make that possible,
the coercion map (referenced strongly on the codomain -- it needs to be
strongly referenced somewhere to keep it alive) must not hold a strong ref
to the domain.
Post by Jeroen Demeyer
I'd like to "fix" this with action maps first, which are conceptually
simpler.
Jeroen.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Jeroen Demeyer
2018-12-04 09:50:14 UTC
Permalink
Post by Nils Bruin
In order to
make that possible, the coercion map (referenced strongly on the
codomain -- it needs to be strongly referenced somewhere to keep it
alive) must not hold a strong ref to the domain.
I wonder if there is a way to somehow reference an object from a pair of
objects: have A and B reference C in such a way that, if either A or B
get deallocated, also C gets deallocated.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Nils Bruin
2018-12-04 17:06:25 UTC
Permalink
Post by Jeroen Demeyer
Post by Nils Bruin
In order to
make that possible, the coercion map (referenced strongly on the
codomain -- it needs to be strongly referenced somewhere to keep it
alive) must not hold a strong ref to the domain.
I wonder if there is a way to somehow reference an object from a pair of
objects: have A and B reference C in such a way that, if either A or B
get deallocated, also C gets deallocated.
Tripledict does that to some extent (with its keys): if one of the key
parts gets deallocated, the weakref callback removes the strong reference
to the value. Note that you will never ensure that C gets deallocated: if
someone else is keeping a ref to C, it should be kept alive. The best you
can hope for is a structure where the existence of both A and B ensures the
continued existence of C, but as soon as one of A or B goes, then the
existence of C is no longer assured.

All the leaks I've seen in settings like this come from loops where A,B,C
have relations as above, but somehow C ends up anchoring a reference to A
and B. That's the big loophole with globally rooted references that are
guarded by a weak reference callback: what would normally be cycles in
garbage now are suddenly globally anchored data structures not eligible for
collection. As we've seen again and again, that's a very difficult paradigm
to program correctly with.

Note that the weak ref to the domain on the "coercion maps" shouldn't be a
performance issue. It's just there to ensure that it's straightforward to
turn it into a full-blown map. In situations where the map is recovered
from the coercion system, it's done by looking up via domain and codomain.
No need to look it up on the map again. So a micro-optimization would be to
short-cut some evaluation steps, if you think the domain is presently being
actively recovered via the weakref.

See https://trac.sagemath.org/ticket/14711 for very detailed discussions
about the how and why of weakly referenced domains in the coercion system.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Jeroen Demeyer
2018-12-04 20:44:13 UTC
Permalink
Post by Nils Bruin
Tripledict does that to some extent (with its keys): if one of the key
parts gets deallocated, the weakref callback removes the strong
reference to the value.
Yes, but then we potentially end up again in the situation where things
are *only* weakly referenced. Currently, you still need a strong
reference in a fixed place and ideally we shouldn't.

I have a very preliminary idea at #26811 to "fix" this.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
s***@gmail.com
2018-12-05 02:06:39 UTC
Permalink
Would it be advisable to change the base programming language to one that does automatic garbage collection instead of having to check to see if a class has been properly disposed like it appears from all of these related bugs?

Sent from my iPhone
Post by Nils Bruin
Tripledict does that to some extent (with its keys): if one of the key
parts gets deallocated, the weakref callback removes the strong
reference to the value.
Yes, but then we potentially end up again in the situation where things are *only* weakly referenced. Currently, you still need a strong reference in a fixed place and ideally we shouldn't.
I have a very preliminary idea at #26811 to "fix" this.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
David Roe
2018-12-05 02:54:22 UTC
Permalink
Post by s***@gmail.com
Would it be advisable to change the base programming language to one that
does automatic garbage collection instead of having to check to see if a
class has been properly disposed like it appears from all of these related
bugs?
I can't tell if you're being sarcastic, but I'll just say that the problem
isn't the language. Both Python and Cython have garbage collection. The
issue is that we want to cache things for speed reasons, but also need to
prevent unbounded growth in Sage's memory usage.
David
Post by s***@gmail.com
Sent from my iPhone
Post by Jeroen Demeyer
Post by Nils Bruin
Tripledict does that to some extent (with its keys): if one of the key
parts gets deallocated, the weakref callback removes the strong
reference to the value.
Yes, but then we potentially end up again in the situation where things
are *only* weakly referenced. Currently, you still need a strong reference
in a fixed place and ideally we shouldn't.
Post by Jeroen Demeyer
I have a very preliminary idea at #26811 to "fix" this.
--
You received this message because you are subscribed to the Google
Groups "sage-devel" group.
Post by Jeroen Demeyer
To unsubscribe from this group and stop receiving emails from it, send
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups
"sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
s***@gmail.com
2018-12-05 06:07:33 UTC
Permalink
No sarcasm intended. I used to work at a software development firm and we were forced to use c++. Most of the back and forth that I have been observing with the team is around memory leaks and I just wondered if that is due to your development language choice. Hence, my comment. However, if you are using low level languages to get the fastest processing time, then I can see why you are doing what you are doing. I’m just afraid that there may be some low level design flaw leading to all of these leaks. For instance you might consider using messages between different objects to help isolate object disposal, instead of carefully remembering what object has to be disposed in which order. This is difficult to do because of the late binding due to the flexibility of sage.

Sent from my iPhone
Post by s***@gmail.com
Would it be advisable to change the base programming language to one that does automatic garbage collection instead of having to check to see if a class has been properly disposed like it appears from all of these related bugs?
I can't tell if you're being sarcastic, but I'll just say that the problem isn't the language. Both Python and Cython have garbage collection. The issue is that we want to cache things for speed reasons, but also need to prevent unbounded growth in Sage's memory usage.
David
Post by s***@gmail.com
Sent from my iPhone
Post by Nils Bruin
Tripledict does that to some extent (with its keys): if one of the key
parts gets deallocated, the weakref callback removes the strong
reference to the value.
Yes, but then we potentially end up again in the situation where things are *only* weakly referenced. Currently, you still need a strong reference in a fixed place and ideally we shouldn't.
I have a very preliminary idea at #26811 to "fix" this.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Steven Craighead
2018-12-05 06:32:26 UTC
Permalink
This post might be inappropriate. Click to display it.
Simon King
2018-12-05 11:50:33 UTC
Permalink
Hi Steven,
Post by Steven Craighead
How difficult is it to create a stack that can control the order of objects
being created and destroyed so you prevent leaks? Can you add a new method
on your base class that is inherited to all children to track this?
One basic principle in the development of Sage is: "Do not reinvent the
wheel, but build the car."

So, we use mainstream programming languages (Python for user
interface, Python and Cython for the Sage library), and we use
existing software packages, partly from the Python ecosystem,
partly stand-alone, such as Singular. These third-party packages can of
course be written in all kinds of languages, such as fortran, C++, Java,
etc.

Therefore, memory leaks could have three causes:

1. A leak in a third-party package. In that case, we send a bug report to
upstream, possibly fixing it by a downstream patch until the bug is
fixed upstream.
2. A bug in Python's cyclic garbage collection. I don't know if we ever
stumbled over such bug, but in principle it is a possible cause.
3. Constructions in the Sage library that prevent Python's cyclic
garbage collection from properly working.

This thread is about yet another instance of cause 3, or rather about a
strategy to generally prevent instances of cause 3.

What prevents cyclic gc from working?
- If there is a reference cycle involving one instance with a __del__
method, then Python would not apply garbage collection to that cycle.
If I recall correctly, there has been a Sage memory leak in the past where
the existence of a __del__ method was the underlying cause.
- For performance reason, Sage does a lot of caching, and the caching
involves objects that typically have lots of cross references. See,
for example, the coercion system. Note that coercion in Sage is not
the same as coercion in C. A coercion in Sage, in first approximation,
is a canonical morphism in a mathematical category. If caching is done
improperly, then there may be some external reference to a reference
cycle, which of course means that Python wouldn't (and shouldn't)
garbage collect that cycle.

In order to prevent external strong references to a reference cycle, we
often use weak references for caching.
But that needs to be done with care: If there are two many weak
references, then stuff may be garbage collected although we wanted it to
be cached, and also following a weak references is slower than following
a strong reference; but using too few weak references creates a memory
leak.

I believe it would be an extremely bad idea to try and implement our own
cyclic garbage collection, disabling Python's.

Best regards,
Simon
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Volker Braun
2018-12-05 12:38:03 UTC
Permalink
Post by Simon King
- If there is a reference cycle involving one instance with a __del__
method, then Python would not apply garbage collection to that cycle.
Python will garbage collect the cycle by not calling __del__ on some of the
cycle members.

The fun source of the bugs that your probably remember is that Cython
__dealloc__ is always called, even in cycles, but on partly-deconstructed
Python objects...
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Simon King
2018-12-05 13:57:14 UTC
Permalink
Post by Volker Braun
Post by Simon King
- If there is a reference cycle involving one instance with a __del__
method, then Python would not apply garbage collection to that cycle.
Python will garbage collect the cycle by not calling __del__ on some of
the cycle members.
Has that changed? I recall that in my early days in Sage (when I created
the first version of my group cohomology package) I had to remove some
__del__ method (or change it into a __dealloc__ method), since otherwise
some reference cycles haven't been collected at all.

Best regards,
Simon
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
E. Madison Bray
2018-12-05 14:06:26 UTC
Permalink
Post by Simon King
- If there is a reference cycle involving one instance with a __del__
method, then Python would not apply garbage collection to that cycle.
Python will garbage collect the cycle by not calling __del__ on some of the cycle members.
Has that changed? I recall that in my early days in Sage (when I created the first version of my group cohomology package) I had to remove some __del__ method (or change it into a __dealloc__ method), since otherwise some reference cycles haven't been collected at all.
There have been some changes in this area, but mostly focus on Python
3: https://www.python.org/dev/peps/pep-0442/

I think it's still true at least on Python 2 that if an object with a
Python-level __del__ method is involved in a reference cycle, then the
cycle still has to be broken manually.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Simon King
2018-12-05 14:30:00 UTC
Permalink
Hi Erik,
Post by E. Madison Bray
Has that changed? I recall that in my early days in Sage (when I created the first version of my group cohomology package) I had to remove some __del__ method (or change it into a __dealloc__ method), since otherwise some reference cycles haven't been collected at all.
There have been some changes in this area, but mostly focus on Python
3: https://www.python.org/dev/peps/pep-0442/
I think it's still true at least on Python 2 that if an object with a
Python-level __del__ method is involved in a reference cycle, then the
cycle still has to be broken manually.
Thank you! Since Sage still uses Python 2, that issue thus is still
relevant.

Best regards,
Simon
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Volker Braun
2018-12-05 15:50:39 UTC
Permalink
Right, automatic clearing of circular references is only in Python 3...
Post by Simon King
Post by Simon King
Post by Volker Braun
Post by Simon King
- If there is a reference cycle involving one instance with a __del__
method, then Python would not apply garbage collection to that
cycle.
Post by Simon King
Post by Volker Braun
Python will garbage collect the cycle by not calling __del__ on some of
the cycle members.
Post by Simon King
Has that changed? I recall that in my early days in Sage (when I created
the first version of my group cohomology package) I had to remove some
__del__ method (or change it into a __dealloc__ method), since otherwise
some reference cycles haven't been collected at all.
There have been some changes in this area, but mostly focus on Python
3: https://www.python.org/dev/peps/pep-0442/
I think it's still true at least on Python 2 that if an object with a
Python-level __del__ method is involved in a reference cycle, then the
cycle still has to be broken manually.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Jeroen Demeyer
2018-12-05 16:27:00 UTC
Permalink
Post by Volker Braun
Right, automatic clearing of circular references is only in Python 3...
I would phrase that as: properly dealing with __del__ is only in Python 3.

Luckily, __del__ is used only very rarely in Sage.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
s***@gmail.com
2018-12-05 14:47:00 UTC
Permalink
Thanks for explaining it in detail. I used to have to chase large amounts of memory leaks and your discussion reminded me of that complex chase every two to three months which had to fit in our software release cycles.

Sent from my iPhone
Post by Simon King
Hi Steven,
Post by Steven Craighead
How difficult is it to create a stack that can control the order of objects
being created and destroyed so you prevent leaks? Can you add a new method
on your base class that is inherited to all children to track this?
One basic principle in the development of Sage is: "Do not reinvent the
wheel, but build the car."
So, we use mainstream programming languages (Python for user
interface, Python and Cython for the Sage library), and we use
existing software packages, partly from the Python ecosystem,
partly stand-alone, such as Singular. These third-party packages can of
course be written in all kinds of languages, such as fortran, C++, Java,
etc.
1. A leak in a third-party package. In that case, we send a bug report to
upstream, possibly fixing it by a downstream patch until the bug is
fixed upstream.
2. A bug in Python's cyclic garbage collection. I don't know if we ever
stumbled over such bug, but in principle it is a possible cause.
3. Constructions in the Sage library that prevent Python's cyclic
garbage collection from properly working.
This thread is about yet another instance of cause 3, or rather about a
strategy to generally prevent instances of cause 3.
What prevents cyclic gc from working?
- If there is a reference cycle involving one instance with a __del__
method, then Python would not apply garbage collection to that cycle.
If I recall correctly, there has been a Sage memory leak in the past where
the existence of a __del__ method was the underlying cause.
- For performance reason, Sage does a lot of caching, and the caching
involves objects that typically have lots of cross references. See,
for example, the coercion system. Note that coercion in Sage is not
the same as coercion in C. A coercion in Sage, in first approximation,
is a canonical morphism in a mathematical category. If caching is done
improperly, then there may be some external reference to a reference
cycle, which of course means that Python wouldn't (and shouldn't)
garbage collect that cycle.
In order to prevent external strong references to a reference cycle, we
often use weak references for caching.
But that needs to be done with care: If there are two many weak
references, then stuff may be garbage collected although we wanted it to
be cached, and also following a weak references is slower than following
a strong reference; but using too few weak references creates a memory
leak.
I believe it would be an extremely bad idea to try and implement our own
cyclic garbage collection, disabling Python's.
Best regards,
Simon
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Simon King
2018-12-05 13:54:36 UTC
Permalink
Hi Jeroen,
Post by Jeroen Demeyer
I am studying the coercion model in detail, looking for optimization
opportunities. One source of slow-down is the use of weak references.
Over time, more and more places in Sage use weak references. But I'd
like to look at the big picture and see where weak references should be
used and where not.
Take coercion maps for example. The coercion model stores a weak
reference to the coercion maps and the maps also store a weak reference
to the domain (but not the codomain).
It's not clear to me why this double weak reference is needed, but maybe
I'm missing something. It seems more logical to use strong references in
the coercion map but then store a weak reference to the map.
I was involved in the development of the weakly cached coercion system, but
I am afraid I don't recall all the rationales behind the construction.
Let's try to explain anyway...

First, I summarise how currently coercion data is stored:

1. Each parent has a cache of coercion maps for which the parent is
codomain. The cache uses a hash table (MonoDict) with a weak reference to
the domain (the key) and a strong reference to the coercion map.
2. In some cases, coercion between two parents P,Q involves the creation
of a new parent R, such that both P and Q coerce into R. That's a
complicated construction, therefore the result is cached. This cache
currently is a global hash table (TripleDict), with keys being P and Q that
are weakly referenced. There is a weak reference to both coercion maps (P
to R and Q to R).
3. A similar scheme is used for actions. Here, in addition, the operator
(operator.mul, operator.add etc) is used as key.

How does that prevent memory leaks? Let there be a coercion map f from Q to
P, and maps gP from P to R and gQ from Q to R. Let C be the global cache.
By => resp. -> I mean a strong resp. weak reference.

- We have
- C->gP, C->gQ, C->Q, C->P
- P=>f, P->Q
- f=>Q, f->P, fP=>R, fQ=>R, fP->P, fQ->Q
- R=>fP, R=>fQ, R->P, R->Q
- There thus is a reference cycle involving a coercion map and its
codomain. All other references in the above graph are weak.
- If there is no external strong reference chain from a global object
to P, then the pair (fP,fQ) is removed from C, and P together with f will
be collected.
- If there is no external strong reference chain from a global object
to Q, then the pair (fP,fQ) is removed from C and f is removed from P's
cache of incoming coercion maps.
- Mild problem: If there is an external strong reference to, say, f,
then it is possible that Q becomes garbage collected anyway, and we would
end up with an invalid map.

I was just drawing the above graph on a napkin, and if I see that
correctly, changing ANY weak reference of the above graph into a *strong*
reference would create a situation where an external strong reference chain
from a global object to one parent would extend *internally* (i.e., inside
of the coercion system) to *another* parent (or it would introduce a strong
reference chain from the global object C to some parent) --- and that's a
memory leak.

So, Jeroen, I guess changing some weak references into strong references
won't work. But I'd be happy to stand corrected.

Best regards,
Simon
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Jeroen Demeyer
2018-12-05 16:43:03 UTC
Permalink
o Mild problem: If there is an external strong reference to, say,
f, then it is possible that Q becomes garbage collected anyway,
and we would end up with an invalid map.
That is one of the things that I would like to fix: maps and actions
should have strong references to their (co)domains. On the other hand,
the coercions and actions should be sufficiently weakly referenced to
prevent memory leaks but not too weakly referenced to keep them alive
when needed. This will rely on #26811.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Nils Bruin
2018-12-05 17:33:50 UTC
Permalink
Post by Jeroen Demeyer
o Mild problem: If there is an external strong reference to, say,
f, then it is possible that Q becomes garbage collected anyway,
and we would end up with an invalid map.
That is one of the things that I would like to fix: maps and actions
should have strong references to their (co)domains. On the other hand,
the coercions and actions should be sufficiently weakly referenced to
prevent memory leaks but not too weakly referenced to keep them alive
when needed. This will rely on #26811.
I think this should work fine for "standard" coercion maps A->B, where B is
an object constructed from A, so A is referenced by B anyway. But there we
didn't need weak references to the domain anyway.

I suspect there might be problems for coercion maps c: A->C, where C is a
longer-lived structure (SR, Qbar, or CC for instance). Presently, "c" is
stored on C. It has to be stored there with a strong reference, to prevent
c from being collected. If c carries a strong reference to A, then now the
life span of A is bounded below by the lifespan to C.

I think you were thinking of limiting the life span of c via referencing it
via a "MultiWeakRef" from both A and C. However, with the scenario above, A
has its life span already bounded below by C anyway, so the "MultiWeakRef"
never gets to work its magic.

(it's nice to see a memory leak that for once does not come from the
UniqueRepresentation cache)

Again, it's ugly, but carrying a weakref to the domain (or no reference to
the domain at all!) is not fundamentally a performance problem for coercion
maps: Whenever a coercion is called, we already know what the domain and
codomain are, because that's how we looked up the coercion in the first
place! It may be a performance problem in setting up coercions, because of
all the shenanigans with stripping off the domain, and it's easy to write
code that does have a penalty if you don't explicitly make sure to use your
a priori knowledge of the domain.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Jeroen Demeyer
2018-12-05 21:52:58 UTC
Permalink
Post by Nils Bruin
I think you were thinking of limiting the life span of c via referencing
it via a "MultiWeakRef" from both A and C. However, with the scenario
above, A has its life span already bounded below by C anyway, so the
"MultiWeakRef" never gets to work its magic.
With my idea, the domain and codomain would be treated exactly the same:
my plan is to reference the map from both the domain and codomain in a
symmetric way. In that way, the lifetime of A wouldn't depend on C the
same way that the lifetime of B or C doesn't depend on A.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Nils Bruin
2018-12-06 00:36:32 UTC
Permalink
Post by Jeroen Demeyer
my plan is to reference the map from both the domain and codomain in a
symmetric way. In that way, the lifetime of A wouldn't depend on C the
same way that the lifetime of B or C doesn't depend on A.
Then you DO get a memory leak as soom as UniqueRepresentation objects are
involved. Let's assume that B is constructed from A. Then a strong
reference to A exists because A occurs as a key in the UniqueRepresentation
weakvalue dict. Because the coercion map from A to B is also stored on A, a
strong reference to B exists. As long as A and B are alive, the coercion
map is alive as well, so these references don't change.

A really has its life time lower bounded by B. So somehow, you'd want B to
be collectible under some circumstances. But then the coercion map needs to
be "unreachable" as well, because it carries a strong reference to B.

Conversely, as long as B is alive, A is also alive, so the coercion map
must be reachable. So it seems to me B should just have a strong reference
to the coercion map. Furthermore, A should not have a strong reference to
the coercion map, because otherwise A and B are immortal.

It doesn't look to me as if a MultiWeakRef is going to help at all with the
lifetime problems here.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Jeroen Demeyer
2018-12-06 07:06:50 UTC
Permalink
Post by Nils Bruin
Because the coercion map from A to
B is also stored on A, a strong reference to B exists.
No. The map will be referenced both from A and from B with one of those
references weak and one of those references strong, without specifying a
priori which one is strong and which one is weak.

That's the point of MultiWeakref: as long as we're not garbage
collecting, it doesn't matter which reference is strong and which one is
weak: only the numbers matter, since that determines the refcount. When
garbage collecting, the tp_traverse algorithm of MultiWeakref decides
dynamically which reference to consider weak and which strong in a way
to maximize the amount of garbage that can be collected.

So let's assume that it's weakly referenced from A and strongly
referenced from B. (The map will also be referenced from the coercion
model, but always weakly; this won't change). Then we have a strong
reference cycle from B to the map and back, which does not prevent
garbage collection. So, if nothing else holds a reference to B, then
both B and the map can be deleted.

Anyway, that's the idea. I haven't worked out all the details yet...
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Simon King
2018-12-06 07:35:00 UTC
Permalink
Hi Jeroen,
Post by Jeroen Demeyer
o Mild problem: If there is an external strong reference to, say,
f, then it is possible that Q becomes garbage collected anyway,
and we would end up with an invalid map.
That is one of the things that I would like to fix: maps and actions
should have strong references to their (co)domains. On the other hand,
the coercions and actions should be sufficiently weakly referenced to
prevent memory leaks but not too weakly referenced to keep them alive
when needed. This will rely on #26811.
Then what will be your reference graph? Or phrased differently:
Where/how will you store coercion maps? That isn't indicated on the
ticket.

Best regards,
Simon
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Jeroen Demeyer
2018-12-06 10:15:03 UTC
Permalink
Post by Simon King
Where/how will you store coercion maps?
The basic idea is the following (I have not worked out all the details yet):

* The coercion model only stores weak references to anything (domains,
codomains, actions, maps).

* Coercion maps store strong references to the domain and codomain.

* The domain and codomain store MultiWeakref references to the map,
where one of those references is weak and one is strong.

* Analogously for actions, with acting set/underlying set instead of
domain/codomain.

It might also make sense to consider pairs of coercion maps specially,
in cases where arithmetic with A and B gives a totally new parent C and
we have maps A -> C and B -> C. I haven't thought about this yet.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Simon King
2018-12-06 13:07:24 UTC
Permalink
Hi Jeroen,
Post by Jeroen Demeyer
* The domain and codomain store MultiWeakref references to the map,
where one of those references is weak and one is strong.
And if I understand correctly what you said in another post, it is
*dynamically* determined which reference is weak and which reference
is strong. When is it determined? During cyclic gc? Does that mean that
you change the innards of Python's cyclic gc? Or is it possible without
patching Python?

Hopefully it is explained in #26790; I'll have a look...

Best regards,
Simon
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Simon King
2018-12-06 13:09:19 UTC
Permalink
Post by Simon King
Hopefully it is explained in #26790; I'll have a look...
Sorry, must be another ticket. #26811, I guess.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Jeroen Demeyer
2018-12-06 13:17:21 UTC
Permalink
Post by Simon King
And if I understand correctly what you said in another post, it is
*dynamically* determined which reference is weak and which reference
is strong. When is it determined? During cyclic gc?
Yes, during GC: that's the only time where it matters. More precisely,
it is determined during every tp_traverse loop.

With a "tp_traverse loop", I mean a loop of the form

for obj in set_of_objects:
type(obj)->tp_traverse(obj, visit, arg)

where "visit" is constant during the loop (arg does not matter).
Post by Simon King
Does that mean that
you change the innards of Python's cyclic gc?
No, nothing needs to be patched. It's just a clever way of implementing
tp_traverse(). I am making assumptions on how GC works internally: in
particular, I am assuming that it's OK to dynamically change the
reference graph as long as it's consistent during every individual
tp_traverse loop.
Post by Simon King
Hopefully it is explained in #26790; I'll have a look...
That's #26811 but that's work in progress.
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+***@googlegroups.com.
To post to this group, send email to sage-***@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.
Continue reading on narkive:
Loading...